Foreword


In the past, the Springer Series in Synergetics has consisted predominantly of 
conference proceedings on this new interdisciplinary field, a circumstance dictated
by its rapid growth. As synergetics matures, it becomes more and more desirable
to present the relevant experimental and theoretical results in a coherent
fashion and to provide students and research workers with fundamental “know- 
how” by means of texts and monographs. 

From the very beginning, we have stressed that the formation of spatial, 
temporal, or functional structures by complex systems can be adequately dealt 
with only if stochastic processes are properly taken into account. For this reason, 
I gave an introduction to these processes in my book Synergetics. An Introduc- 
tion, Volume 1 of this series. But research workers and students wanting to 
penetrate the theory of these processes more deeply were quite clearly in need of a 
far more comprehensive text. This gap has been filled by the present book by 
Professor Crispin Gardiner. It provides a solid basis for forthcoming volumes in 
the series which draw heavily on the methods and concepts of stochastic pro- 
cesses. These include Noise-Induced Transitions, by W. Horsthemke and R. 
Lefever, The Kinetic Theory of Electromagnetic Processes, by Y. L. Klimonto- 
vich, and Concepts and Models of a Quantitative Sociology, by W. Weidlich and 
G. Haag. 

Though synergetics provides us with rather general concepts, it is by no 
means “art pour l’art”. On the contrary, the processes it deals with are of funda- 
mental importance in self-organizing systems such as those of biology, and in the 
construction of devices, e.g., in electronics. Therefore I am particularly pleased 
that the present book has been written by a scientist who has himself applied — 
and even developed — such methods in the theory of random processes, for ex- 
ample in the fields of quantum optics and chemical reactions. Professor 
Gardiner’s book will prove most useful not only to students and scientists work- 
ing in synergetics, but also to a much wider audience interested in the theory of 
random processes and its important applications to a variety of fields. 


H. Haken 


Preface to the Corrected Printing 


Since I started writing this book ten years ago, a great deal has happened. I have been 
gratified to find how popular my exposition has become, and of course continually 
bemused that errors still come to light. I am very grateful to all those who have 
pointed them out to me, in particular to Matthew Collett, Scott Parkins, and Andrew 
Smith, who, as students and colleagues, over the last five years have kept me aware 
of everything they noticed. As well, I must also thank Prof. Urbaan Titulaer and Mr. 
Alexander Kainz, of the Johannes Kepler University of Linz, who sent me a very full 
and careful list of corrections. 

As a consequence a number of corrections have been made in this second printing 
of the second edition. The most significant of these is the removal of the converse 
result of Sect. 3.7.3b, which was incorrectly derived, and which is probably not true. 

At this time I must also express my thanks to my wife Helen May and my youngest 
daughter Nell, who have been of such support in the years since this book was written. 


Pasadena, California C. W. Gardiner 
October 1989 


Preface to the Second Edition 


In this edition I have corrected a number of misprints, and made a few altera- 
tions of a more substantial kind. In particular, I have rewritten Sections 4.2.3 
and 4.3.6, using a more correct definition of the Stratonovich stochastic integral; 
I have clarified a slightly confusing exposition on boundaries in Section 5.2.1e; 
and I have rewritten Sections 6.3.3 and 6.4.4c to take account of recent progress 
in these fields. I have also slightly augmented the bibliography and references. 


Pasadena, California C. W. Gardiner 
March 1985 


Preface to the First Edition 


My intention in writing this book was to put down in relatively simple language 
and in a reasonably deductive form, all those formulae and methods which have 
been scattered throughout the scientific literature on stochastic methods through- 
out the eighty years that they have been in use. This might seem an unnecessary 
aim since there are scores of books entitled “Stochastic Processes”, and similar 
titles, but careful perusal of these soon shows that their aim does not coincide 
with mine. There are purely theoretical and highly mathematical books, there are 
books related to electrical engineering or communication theory, and there are 
books for biologists — many of them very good, but none of them covering the 
kind of applications that appear nowadays so frequently in Statistical Physics, 
Physical Chemistry, Quantum Optics and Electronics, and a host of other 
theoretical subjects that form part of the subject area of Synergetics, to which 
Series this book belongs. 

The main new point of view here is the amount of space which deals with 
methods of approximating problems, or transforming them for the purpose of 
approximating them. I am fully aware that many workers will not see their methods
here. But my criterion here has been whether an approximation is systematic.
Many approximations are based on unjustifiable or uncontrollable assumptions, 
and are justified a posteriori. Such approximations are not the subject of a
systematic book — at least, not until they are properly formulated, and their 
range of validity controlled. In some cases I have been able to put certain 
approximations on a systematic basis, and they appear here — in other cases I 
have not. Others have been excluded on the grounds of space and time, and 
I presume there will even be some that have simply escaped my attention. 

A word on the background assumed. The reader must have a good knowledge 
of practical calculus including contour integration, matrix algebra, differential 
equations, both ordinary and partial, at the level expected of a first degree in 
applied mathematics, physics or theoretical chemistry. This is not a text book for 
a particular course, though it includes matter that has been used in the University 
of Waikato in a graduate course in physics. It contains material which I would 
expect any student completing a doctorate in our quantum optics and stochastic 
processes theory group to be familiar with. There is thus a certain bias towards 
my own interests, which is the prerogative of an author. 

I expect the readership to consist mainly of theoretical physicists and 
chemists, and thus the general standard is that of these people. This is not a rigor- 
ous book in the mathematical sense, but it contains results, all of which I am con- 
fident are provable rigorously, and whose proofs can be developed out of the 
demonstrations given. 


The organisation of the book is as in the following table, and might raise some 
eyebrows. For, after introducing the general properties of Markov processes, I 
have chosen to base the treatment on the conceptually difficult but intuitively 
appealing concept of the stochastic differential equation. I do this because of my 
own experience of the simplicity of stochastic differential equation methods, once 
one has become familiar with the Ito calculus, which I have presented in Chapter 4 
in a rather straightforward manner, such as I have not seen in any previous text. It 
is true that there is nothing in a stochastic differential equation that is not in a
Fokker-Planck equation, but the stochastic differential equation is so much easier 
to write down and manipulate that only an excessively zealous purist would try to 
eschew the technique. On the other hand, only similar purists of an opposing camp 
would try to develop the theory without the Fokker-Planck equation, so Chapter 5 
introduces this as a complementary and sometimes overlapping method of 
handling the same problem. Chapter 6 completes what may be regarded as the 
“central core” of the book with a treatment of the two main analytical approxima- 
tion techniques: small noise expansions and adiabatic elimination. 

The remainder of the book is built around this core, since very many methods 
of treating the jump processes in Chapter 7 and the spatially distributed systems, 
themselves best treated as jump processes, depend on reducing the system to an 
approximating diffusion process. Thus, although logically the concept of a jump
process is much simpler than that of a diffusion process, analytically, and in
terms of computational methods, the reverse is true. 

Chapter 9 is included because of the practical importance of bistability and, 
as indicated, it is almost independent of all but the first five chapters. Again, I 
have included only systematic methods, for there is a host of ad hoc methods in 
this field. 

Chapter 10 requires some knowledge of quantum mechanics. I hope it will be 
of interest to mathematicians who study stochastic processes because there is still 
much to be done in this field which is of great practical importance and which 
naturally introduces new realms in stochastic processes — in particular, the 
rather fascinating field of stochastic processes in the complex plane which turn 
up as the only way of reducing quantum processes to ordinary stochastic proc- 
esses. It is with some disappointment that I have noted a tendency among mathe- 
maticians to look the other way when quantum Markov processes are mentioned, 
for there is much to be done here. For example, I know of no treatment of escape 
problems in quantum Markov systems. 

It is as well to give some idea of what is not here. I deal entirely with Markov 
processes, or systems that can be embedded in Markov processes. This means 
that no work on linear non-Markovian stochastic differential equations has been
included, which I regret. However, van Kampen has covered this field rather 
well, and it is now well covered in his book on stochastic processes. 

Other subjects have been omitted because I feel that they are not yet ready for 
a definitive formulation. For example, the theory of adiabatic elimination in 
spatially distributed systems, the theory of fluctuating hydrodynamics, renor- 
malisation group methods in stochastic differential equations, and associated 
critical phenomena. There is a great body of literature on all of these, and a 


Further, for the sake of compactness and simplicity I have normally present- 
ed only one way of formulating certain methods. For example, there are several 
different ways of formulating the adiabatic elimination results, though few have 
been used in this context. My formulation of quantum Markov processes and the 
use of P-representations is only one of many. To have given a survey of all 
formulations would have required an enormous and almost unreadable book. 
However, where appropriate I have included specific references, and further 
relevant matter can be found in the general bibliography. 


Hamilton, New Zealand C. W. Gardiner 
January, 1983 
1. A Historical Introduction


1.1 Motivation 


Theoretical science up to the end of the nineteenth century can be viewed as the 
study of solutions of differential equations and the modelling of natural phenomena
by deterministic solutions of these differential equations. It was at that time
commonly thought that if all initial data could only be collected, one would be
able to predict the future with certainty. 

We now know this is not so, in at least two ways. Firstly, the advent of quantum
mechanics within a quarter of a century gave rise to a new physics, and hence a
new theoretical basis for all science, which had as an essential basis a purely
statistical element. Secondly, more recently, the concept of chaos has arisen, in
which even quite simple differential equation systems have the rather alarming
property of giving rise to essentially unpredictable behaviour. To be sure, one can
predict the future of such a system given its initial conditions, but any error in the
initial conditions is so rapidly magnified that no practical predictability is left.
In fact, the existence of chaos is really not surprising, since it agrees with more of
our everyday experience than does pure predictability, but it is surprising perhaps
that it has taken so long for the point to be made. 


Fig. 1.1. Stochastic simulation of an isomerisation reaction X ⇌ A (vertical axis: number of molecules)
Chaos and quantum mechanics are not the subject of this chapter. Here I wish
to give a semihistorical outline of how a phenomenological theory of fluctuating 
phenomena arose and what its essential points are. The very usefulness of predic- 
table models indicates that life is not entirely chaos. But there is a limit to predic- 
tability, and what we shall be most concerned with in this book are models of 
limited predictability. The experience of careful measurements in science normally 
gives us data like that of Fig. 1.1, representing the growth of the number of molecules
of a substance X formed by a chemical reaction of the form X ⇌ A. A quite
well defined deterministic motion is evident, and this is reproducible, unlike the 
fluctuations around this motion, which are not. 


1.2. Some Historical Examples 


1.2.1 Brownian Motion 


The observation that, when suspended in water, small pollen grains are found to 
be in a very animated and irregular state of motion, was first systematically 
investigated by Robert Brown in 1827, and the observed phenomenon took the 
name Brownian Motion because of his fundamental pioneering work. Brown was 
a botanist—indeed a very famous botanist—and of course tested whether this 
motion was in some way a manifestation of life. By showing that the motion was 
present in any suspension of fine particles—glass, minerals and even a fragment of 
the sphinx—he ruled out any specifically organic origin of this motion. The motion 
is illustrated in Fig. 1.2.


Fig. 1.2. Motion of a point undergoing Brownian 
motion 


The riddle of Brownian motion was not quickly solved, and a satisfactory 
explanation did not come until 1905, when Einstein published an explanation under 
the rather modest title “Über die von der molekular-kinetischen Theorie der
Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen”
(concerning the motion, as required by the molecular-kinetic theory of heat,
of particles suspended in liquids at rest) [1.2]. The same explanation was indepen- 
dently developed by Smoluchowski [1.3], who was responsible for much of the later 
systematic development and for much of the experimental verification of Brownian 
motion theory. 

There were two major points in Einstein’s solution to the problem of Brownian 
motion. 


(i) The motion is caused by the exceedingly frequent impacts on the pollen grain of 
the incessantly moving molecules of liquid in which it is suspended. 

(ii) The motion of these molecules is so complicated that its effect on the pollen
grain can only be described probabilistically in terms of exceedingly frequent 
statistically independent impacts. 

The existence of fluctuations like these ones calls out for a statistical explanation 
of this kind of phenomenon. Statistics had already been used by Maxwell and 
Boltzmann in their famous gas theories, but only as a description of possible states 
and the likelihood of their achievement and not as an intrinsic part of the time
evolution of the system. Rayleigh [1.1] was in fact the first to consider a statistical 
description in this context, but for one reason or another, very little arose out of 
his work. For practical purposes, Einstein’s explanation of the nature of Brownian 
motion must be regarded as the beginning of stochastic modelling of natural 
phenomena. 

Einstein’s reasoning is very clear and elegant. It contains all the basic concepts 
which will make up the subject matter of this book. Rather than paraphrase a classic 
piece of work, I shall simply give an extended excerpt from Einstein’s paper (author’s 
translation): 

“It must clearly be assumed that each individual particle executes a motion 
which is independent of the motions of all other particles; it will also be considered 
that the movements of one and the same particle in different time intervals are 
independent processes, as long as these time intervals are not chosen too small.

“We introduce a time interval τ into consideration, which is very small compared
to the observable time intervals, but nevertheless so large that in two successive
time intervals τ, the motions executed by the particle can be thought of as
events which are independent of each other.

“Now let there be a total of n particles suspended in a liquid. In a time interval
τ, the X-coordinates of the individual particles will increase by an amount Δ, where
for each particle Δ has a different (positive or negative) value. There will be a
certain frequency law for Δ; the number dn of the particles which experience a
shift which is between Δ and Δ + dΔ will be expressible by an equation of the form


$$ dn = n\,\phi(\Delta)\,d\Delta , \tag{1.2.1} $$

where

$$ \int_{-\infty}^{\infty} \phi(\Delta)\,d\Delta = 1 \tag{1.2.2} $$

and φ is only different from zero for very small values of Δ, and satisfies the
condition

$$ \phi(\Delta) = \phi(-\Delta) . \tag{1.2.3} $$


“We now investigate how the diffusion coefficient depends on φ. We shall once
more restrict ourselves to the case where the number ν of particles per unit volume
depends only on x and t.

“Let ν = f(x, t) be the number of particles per unit volume. We compute the
distribution of particles at the time t + τ from the distribution at time t. From the
definition of the function φ(Δ), it is easy to find the number of particles which at
time t + τ are found between two planes perpendicular to the x-axis and passing
through points x and x + dx. One obtains

$$ f(x, t+\tau)\,dx = dx \int_{-\infty}^{\infty} f(x+\Delta, t)\,\phi(\Delta)\,d\Delta . \tag{1.2.4} $$

But since τ is very small, we can set

$$ f(x, t+\tau) = f(x, t) + \tau \frac{\partial f}{\partial t} . \tag{1.2.5} $$


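Furthermore, we develop f(x + Δ, t) in powers of Δ:

$$ f(x+\Delta, t) = f(x, t) + \Delta \frac{\partial f(x, t)}{\partial x} + \frac{\Delta^2}{2!} \frac{\partial^2 f(x, t)}{\partial x^2} + \cdots . \tag{1.2.6} $$

We can use this series under the integral, because only small values of Δ contribute
to this equation. We obtain

$$ f + \frac{\partial f}{\partial t}\,\tau = f \int_{-\infty}^{\infty} \phi(\Delta)\,d\Delta + \frac{\partial f}{\partial x} \int_{-\infty}^{\infty} \Delta\,\phi(\Delta)\,d\Delta + \frac{\partial^2 f}{\partial x^2} \int_{-\infty}^{\infty} \frac{\Delta^2}{2}\,\phi(\Delta)\,d\Delta + \cdots . \tag{1.2.7} $$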


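Because φ(x) = φ(−x), the second, fourth, etc., terms on the right-hand side vanish,
while out of the 1st, 3rd, 5th, etc., terms, each one is very small compared with the
previous. We obtain from this equation, by taking into consideration

$$ \int_{-\infty}^{\infty} \phi(\Delta)\,d\Delta = 1 \tag{1.2.8} $$

and setting

$$ \frac{1}{\tau} \int_{-\infty}^{\infty} \frac{\Delta^2}{2}\,\phi(\Delta)\,d\Delta = D \tag{1.2.9} $$

and keeping only the 1st and third terms of the right-hand side,

$$ \frac{\partial f}{\partial t} = D \frac{\partial^2 f}{\partial x^2} \, \ldots \tag{1.2.10} $$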
This is already known as the differential equation of diffusion and it can be seen that
D is the diffusion coefficient. ... 

“The problem, which corresponds to the problem of diffusion from a single
point (neglecting the interaction between the diffusing particles), is now completely
determined mathematically: its solution is

$$ f(x, t) = \frac{n}{\sqrt{4\pi D}} \, \frac{e^{-x^2/4Dt}}{\sqrt{t}} . \tag{1.2.11} $$

“We now calculate, with the help of this equation, the displacement λ_x in the
direction of the X-axis that a particle experiences on the average or, more exactly,
the square root of the arithmetic mean of the square of the displacement in the
direction of the X-axis; it is

$$ \lambda_x = \sqrt{\overline{x^2}} = \sqrt{2Dt} . \tag{1.2.12} $$ ”


Einstein’s derivation is really based on a discrete time assumption, that impacts 
happen only at times 0, τ, 2τ, 3τ, ..., and his resulting equation (1.2.10) for the
distribution function f(x, t) and its solution (1.2.11) are to be regarded as approximations,
in which τ is considered so small that t may be considered as being
continuous. Nevertheless, his description contains very many of the major concepts 
which have been developed more and more generally and rigorously since then, 
and which will be central to this book. For example: 


i) The Chapman-Kolmogorov Equation occurs as Einstein’s equation (1.2.4). It
states that the probability of the particle being at point x at time t + τ is given by
the sum of the probability of all possible “pushes” Δ from positions x + Δ, multiplied
by the probability of being at x + Δ at time t. This assumption is based on
the independence of the push Δ of any previous history of the motion: it is only
necessary to know the initial position of the particle at time t, not at any previous
time. This is the Markov postulate and the Chapman-Kolmogorov equation, of
which (1.2.4) is a special form, is the central dynamical equation to all Markov 
processes. These will be studied in detail in Chap. 3. 


ii) The Fokker-Planck Equation: Eq. (1.2.10) is the diffusion equation, a special case
of the Fokker-Planck equation, which describes a large class of very interesting
stochastic processes in which the system has a continuous sample path. In this case,
that means that the pollen grain’s position, if thought of as obeying a probabilistic
law given by solving the diffusion equation (1.2.10), in which time t is continuous
(not discrete, as assumed by Einstein), can be written x(t), where x(t) is a continuous
function of time, but a random function. This leads us to consider the possibility of
describing the dynamics of the system in some direct probabilistic way, so that we 
would have a random or stochastic differential equation for the path. This procedure 
was initiated by Langevin with the famous equation that to this day bears his name. 
We will discuss this in detail in Chap. 4. 


iii) The Kramers-Moyal and similar expansions are essentially the same as that
used by Einstein to go from (1.2.4) (the Chapman-Kolmogorov equation) to the
diffusion equation (1.2.10). The use of this type of approximation, which effectively
replaces a process whose sample paths need not be continuous with one whose 
paths are continuous, has been a topic of discussion in the last decade. Its use 
and validity will be discussed in Chap. 7. 
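
As a rough numerical check of Einstein's argument, the sketch below gives each particle one independent, symmetric push Δ per interval τ and compares the simulated mean square displacement with the value 2Dt implied by (1.2.12), with D computed from (1.2.9). The uniform push distribution, the values of τ and of the push width, the particle count, the seed and the function name `final_positions` are all illustrative assumptions; none of them comes from Einstein's paper.

```python
import random

def final_positions(n_particles, n_steps, half_width, rng):
    """Positions after n_steps independent, symmetric pushes per particle."""
    positions = []
    for _ in range(n_particles):
        x = 0.0
        for _ in range(n_steps):
            # One push Delta per interval tau, drawn from a symmetric phi(Delta).
            x += rng.uniform(-half_width, half_width)
        positions.append(x)
    return positions

tau = 0.01         # assumed time between pushes
half_width = 0.1   # assumed half-width of the uniform push distribution
n_steps = 200

# (1.2.9): D = (1/tau) * integral of (Delta^2 / 2) phi(Delta) dDelta.
# For a uniform distribution on [-h, h] the second moment is h^2 / 3.
D = (half_width ** 2 / 3.0) / (2.0 * tau)
t = n_steps * tau

rng = random.Random(12345)   # fixed seed so the run is repeatable
positions = final_positions(2000, n_steps, half_width, rng)
mean_sq = sum(x * x for x in positions) / len(positions)

print("simulated mean square displacement:", mean_sq)
print("prediction 2Dt from (1.2.12):      ", 2.0 * D * t)
```

With a finite number of particles the two numbers agree only up to sampling error, which shrinks as the particle count grows.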


1.2.2 Langevin’s Equation 


Some time after Einstein’s original derivation, Langevin [1.4] presented a new 
method which was quite different from Einstein’s and, according to him, “infinitely
more simple.” His reasoning was as follows.

From statistical mechanics, it was known that the mean kinetic energy of the 
Brownian particle should, in equilibrium, reach a value 


$$ \left\langle \tfrac{1}{2} m v^2 \right\rangle = \tfrac{1}{2} kT \tag{1.2.13} $$


(T: absolute temperature, k: Boltzmann’s constant). (Both Einstein and Smoluchowski
had used this fact.) Acting on the particle, of mass m, there should be two
forces: 


i) a viscous drag: assuming this is given by the same formula as in macroscopic
hydrodynamics, this is −6πηa dx/dt, η being the viscosity and a the radius of
the particle, assumed spherical.


ii) another fluctuating force X which represents the incessant impacts of the 
molecules of the liquid on the Brownian particle. All that is known about it is that
fact, and that it should be positive and negative with equal probability. Thus, the
equation of motion for the position of the particle is given by Newton’s law as

$$ m \frac{d^2 x}{dt^2} = -6\pi\eta a \frac{dx}{dt} + X \tag{1.2.14} $$

and multiplying by x, this can be written

$$ \frac{m}{2} \frac{d^2 (x^2)}{dt^2} - m v^2 = -3\pi\eta a \frac{d(x^2)}{dt} + Xx , \tag{1.2.15} $$


where v = dx/dt. We now average over a large number of different particles and use 
(1.2.13) to obtain an equation for ⟨x²⟩:

$$ \frac{m}{2} \frac{d^2 \langle x^2 \rangle}{dt^2} + 3\pi\eta a \frac{d \langle x^2 \rangle}{dt} = kT , \tag{1.2.16} $$

where the term ⟨xX⟩ has been set equal to zero because (to quote Langevin) “of
the irregularity of the quantity X”. One then finds the general solution

$$ \frac{d \langle x^2 \rangle}{dt} = \frac{kT}{3\pi\eta a} + C \exp(-6\pi\eta a t/m) , \tag{1.2.17} $$
where C is an arbitrary constant. Langevin estimated that the decaying exponential
approaches zero with a time constant of the order of 10⁻⁸ s, which for any practical
observation at that time, was essentially immediate. Thus, for practical purposes,
we can neglect this term and integrate once more to get

$$ \langle x^2 \rangle - \langle x_0^2 \rangle = [kT/(3\pi\eta a)]\,t . \tag{1.2.18} $$

This corresponds to (1.2.12) as deduced by Einstein, provided we identify

$$ D = kT/(6\pi\eta a) , \tag{1.2.19} $$


a result which Einstein derived in the same paper but by independent means. 
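
The intermediate steps between (1.2.14) and (1.2.17) are compressed in the text; written out, they might run as follows. The shorthand u for d⟨x²⟩/dt is introduced here only for convenience and is not part of Langevin's notation.

```latex
% Identities used when multiplying (1.2.14) by x, with v = dx/dt:
x \frac{dx}{dt} = \frac{1}{2} \frac{d(x^2)}{dt}, \qquad
x \frac{d^2 x}{dt^2} = \frac{1}{2} \frac{d^2 (x^2)}{dt^2} - v^2 .

% Averaging (1.2.15), with <xX> = 0 and <m v^2> = kT from (1.2.13),
% gives (1.2.16). With u = d<x^2>/dt this is a first-order linear equation:
\frac{m}{2} \frac{du}{dt} + 3\pi\eta a \, u = kT
\quad\Longrightarrow\quad
u(t) = \frac{kT}{3\pi\eta a} + C \exp\!\left( -\frac{6\pi\eta a \, t}{m} \right),

% which is (1.2.17): a constant particular solution plus the decaying
% homogeneous solution with time constant m / (6 \pi \eta a).
```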

Langevin’s equation was the first example of the stochastic differential equation:
a differential equation with a random term X and hence whose solution is, in some 
sense, a random function. Each solution of Langevin’s equation represents a 
different random trajectory and, using only rather simple properties of X (his 
fluctuating force), measurable results can be derived. 
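
To make the idea of "a different random trajectory for each solution" concrete, here is a minimal sketch that integrates (1.2.14) numerically for many particles and estimates the long-time growth rate of ⟨x²⟩, to be compared with kT/(3πηa) from (1.2.18). It rests on assumptions that go beyond the text: the fluctuating force X is modelled as Gaussian white noise of strength 2γkT, with γ standing for 6πηa; the integration uses a simple Euler-Maruyama step; and reduced units m = γ = kT = 1 are used. The function name and all numerical settings are illustrative.

```python
import math
import random

def mean_square_displacement(n_particles, n_steps, dt, m, gamma, kT, record, rng):
    """Average x^2 at the step numbers in `record`, using Euler-Maruyama steps."""
    totals = {step: 0.0 for step in record}
    # Assumed noise model: X dt = sqrt(2 * gamma * kT) dW (Gaussian white noise).
    kick = math.sqrt(2.0 * gamma * kT * dt) / m
    for _ in range(n_particles):
        x, v = 0.0, 0.0
        for step in range(1, n_steps + 1):
            x += v * dt
            v += -(gamma / m) * v * dt + kick * rng.gauss(0.0, 1.0)
            if step in totals:
                totals[step] += x * x
    return {step: total / n_particles for step, total in totals.items()}

m, gamma, kT = 1.0, 1.0, 1.0       # reduced units; gamma stands for 6*pi*eta*a
dt = 0.05
step_early, step_late = 100, 400   # t = 5 and t = 20, both well beyond m/gamma

rng = random.Random(2024)          # fixed seed so the run is repeatable
msd = mean_square_displacement(2000, step_late, dt, m, gamma, kT,
                               (step_early, step_late), rng)
slope = (msd[step_late] - msd[step_early]) / ((step_late - step_early) * dt)

print("simulated d<x^2>/dt:          ", slope)
print("kT/(3*pi*eta*a) = 2*kT/gamma: ", 2.0 * kT / gamma)
```

The slope is measured between two times much larger than m/γ, so that the decaying exponential in (1.2.17) has already died away.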

One question arises: Einstein explicitly required that (on a sufficiently large time 
scale) the change Δ be completely independent of the preceding value of Δ. Langevin
did not mention such a concept explicitly, but it is there, implicitly, when one
sets ⟨xX⟩ equal to zero. The concept that X is extremely irregular and (which is not
mentioned by Langevin, but is implicit) that X and x are independent of each
other (that the irregularities in x as a function of time do not somehow conspire
to be always in the same direction as those of X, so that the product could possibly
not be set equal to zero); these are really equivalent to Einstein’s independence
assumption. The method of Langevin equations is clearly very much more direct, 
at least at first glance, and gives a very natural way of generalising a dynamical 
equation to a probabilistic equation. An adequate mathematical grounding for 
the approach of Langevin, however, was not available until more than 40 years 
later, when Ito formulated his concepts of stochastic differential equations. And 
in this formulation, a precise statement of the independence of X and x led to the 
calculus of stochastic differentials, which now bears his name, and which will be
fully developed in Chap. 4. 

As a physical subject, Brownian motion had its heyday in the first two decades 
of this century, when Smoluchowski in particular, and many others carried out 
extensive theoretical and experimental investigations, which showed complete agree- 
ment with the original formulation of the subject as initiated by himself and 
Einstein, see [1.5]. More recently, with the development of laser light scattering 
spectroscopy, Brownian motion has become very much more quantitatively 
measurable. The technique is to shine intense, coherent laser light into a small 
volume of liquid containing Brownian particles, and to study the fluctuations in the 
intensity of the scattered light, which are directly related to the motions of the 
Brownian particles. By these means it is possible to observe Brownian motion of 
much smaller particles than the traditional pollen, and to derive useful data about 
the sizes of viruses and macromolecules. With the preparation of more concentrated 
suspensions, interactions between the particles appear, generating interesting and 
quite complex problems related to macromolecular suspensions and colloids [1.6].

The general concept of fluctuations describable by such equations has developed
very extensively in a very wide range of situations. The advantages of a continuous 
description turn out to be very significant, since only a very few parameters are 
required, i.e., essentially the coefficients of the derivatives in (1.2.7): 


$$ \int \Delta\,\phi(\Delta)\,d\Delta , \quad \text{and} \quad \int \Delta^2 \phi(\Delta)\,d\Delta . \tag{1.2.20} $$


It is rare to find a problem which cannot be specified, in at least some degree of
approximation, by such a system, and for qualitative simple analysis of problems it 
is normally quite sufficient to consider an appropriate Fokker-Planck equation, of 
a form obtained by allowing both coefficients (1.2.20) to depend on x, and in a space 
of an appropriate number of dimensions. 


1.3. Birth-Death Processes 


A wide variety of phenomena can be modelled by a particular class of process called 
a birth-death process. The name obviously stems from the modelling of human or 
animal populations in which individuals are born, or die. One of the most entertain- 
ing models is that of the prey-predator system consisting of two kinds of animal, 
one of which preys on the other, which is itself supplied with an inexhaustible food 
supply. Thus letting X symbolise the prey, Y the predator, and A the food of the 
prey, the process under consideration might be 


$$ X + A \to 2X \tag{1.3.1a} $$
$$ X + Y \to 2Y \tag{1.3.1b} $$
$$ Y \to B \tag{1.3.1c} $$


which have the following naive, but charming interpretation. The first equation 
symbolises the prey eating one unit of food, and reproducing immediately. The 
second equation symbolises a predator consuming a prey (who thereby dies—this 
is the only death mechanism considered for the prey) and immediately reproducing. 
The final equation symbolises the death of the predator by natural causes. It is easy 
to guess model differential equations for x and y, the numbers of X and Y. One 
might assume that the first reaction symbolises a rate of production of X propor- 
tional to the product of x and the amount of food; the second equation a produc- 
tion of Y (and an equal rate of consumption of X) proportional to xy, and the last 
equation a death rate of Y, in which the rate of death of Y is simply proportional to 
y; thus we might write 


$$ \frac{dx}{dt} = k_1 a x - k_2 x y \tag{1.3.2a} $$
$$ \frac{dy}{dt} = k_2 x y - k_3 y . \tag{1.3.2b} $$
These equations, which were independently developed by Lotka
[1.7] and Volterra [1.8], have very interesting oscillating solutions, as presented in
Fig. 1.3a. These oscillations are qualitatively easily explicable. In the absence of
significant numbers of predators, the prey population grows rapidly until the
presence of so much prey for the predators to eat stimulates their rapid reproduction,
at the same time reducing the number of prey, which get eaten. Because a large
number of prey have been eaten, there are no longer enough to maintain the 
population of predators, which then die out, returning us to our initial situation. 
The cycles repeat indefinitely and are indeed, at least qualitatively, a feature of 
many real prey-predator systems. An example is given in Fig. 1.3b. 
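
The oscillations of the deterministic equations (1.3.2) can be reproduced with a few lines of numerical integration. The sketch below uses a classical fourth-order Runge-Kutta step; the rate constants, initial populations, step size and function names are assumptions chosen only to give clearly visible cycles, and `k1a` stands for the product k₁a. The quantity returned by `invariant` is constant along exact solutions of (1.3.2), so its drift gives a simple check on the integration.

```python
import math

def rhs(x, y, k1a, k2, k3):
    """Right-hand sides of (1.3.2a) and (1.3.2b); k1a stands for k1 * a."""
    return k1a * x - k2 * x * y, k2 * x * y - k3 * y

def rk4_step(x, y, dt, k1a, k2, k3):
    """One classical fourth-order Runge-Kutta step."""
    ax, ay = rhs(x, y, k1a, k2, k3)
    bx, by = rhs(x + 0.5 * dt * ax, y + 0.5 * dt * ay, k1a, k2, k3)
    cx, cy = rhs(x + 0.5 * dt * bx, y + 0.5 * dt * by, k1a, k2, k3)
    dx, dy = rhs(x + dt * cx, y + dt * cy, k1a, k2, k3)
    x_new = x + dt * (ax + 2.0 * bx + 2.0 * cx + dx) / 6.0
    y_new = y + dt * (ay + 2.0 * by + 2.0 * cy + dy) / 6.0
    return x_new, y_new

def invariant(x, y, k1a, k2, k3):
    """Quantity conserved along exact solutions of (1.3.2)."""
    return k2 * (x + y) - k3 * math.log(x) - k1a * math.log(y)

k1a, k2, k3 = 1.0, 0.01, 1.0   # assumed rate constants
x, y = 150.0, 50.0             # assumed initial populations
dt, n_steps = 0.01, 2000

start = invariant(x, y, k1a, k2, k3)
xs, ys = [x], [y]
for _ in range(n_steps):
    x, y = rk4_step(x, y, dt, k1a, k2, k3)
    xs.append(x)
    ys.append(y)
drift = abs(invariant(x, y, k1a, k2, k3) - start)

print("prey range:    ", min(xs), "to", max(xs))
print("predator range:", min(ys), "to", max(ys))
print("drift of the conserved quantity:", drift)
```

Both populations swing above and below the stationary values x = k₃/k₂ and y = k₁a/k₂, which is the cycling described in the paragraph above.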


Fig. 1.3a-c. Time development in prey-predator systems. (a) Plot of solutions of the deterministic
equations (1.3.2) (x = solid line, y = dashed line). (b) Data for a real prey-predator system. Here
the predator is a mite (Eotetranychus sexmaculatus, dashed line) which feeds on oranges, and
the prey is another mite (Typhlodromus occidentalis). Data from [1.16, 17]. (c) Simulation of
stochastic equations (1.3.3)
Of course, the realistic systems do not follow the solutions of differential
equations exactly—they fluctuate about such curves. One must include these 
fluctuations and the simplest way to do this is by means of a birth-death master 
equation. We assume a probability distribution, P(x, y, t), for the number of indi- 
viduals at a given time and ask for a probabilistic law corresponding to (1.3.2). 
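This is done by assuming that in an infinitesimal time Δt, the following transition
probability laws hold.


$$ \mathrm{Prob}(x \to x+1;\ y \to y) = k_1 a x \,\Delta t , \tag{1.3.3a} $$
$$ \mathrm{Prob}(x \to x-1;\ y \to y+1) = k_2 x y \,\Delta t , \tag{1.3.3b} $$
$$ \mathrm{Prob}(x \to x;\ y \to y-1) = k_3 y \,\Delta t , \tag{1.3.3c} $$
$$ \mathrm{Prob}(x \to x;\ y \to y) = 1 - (k_1 a x + k_2 x y + k_3 y)\,\Delta t . \tag{1.3.3d} $$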


Thus we simply replace the deterministic rate laws by probability laws.
We then employ what amounts to the same equation as Einstein and others used,
i.e., the Chapman-Kolmogorov equation: we write the probability at
$t + \Delta t$ as a sum of terms, each of which represents the probability of a previous
state multiplied by the probability of a transition to the state (x, y). Thus, we
find


$$\begin{aligned}
\frac{P(x,y,t+\Delta t) - P(x,y,t)}{\Delta t} ={}& k_1 a (x-1)\,P(x-1,y,t) + k_2 (x+1)(y-1)\,P(x+1,y-1,t) \\
& + k_3 (y+1)\,P(x,y+1,t) - (k_1 a x + k_2 x y + k_3 y)\,P(x,y,t)
\end{aligned} \tag{1.3.4}$$

and letting $\Delta t \to 0$, the left-hand side becomes $\partial P(x, y, t)/\partial t$. In writing the
assumed probability laws (1.3.3), we are assuming that the probability of each of the
events occurring can be determined simply from the knowledge of x and y. This is again the
Markov postulate which we mentioned in Sect. 1.2.1. In the case of Brownian
motion, very convincing arguments can be made in favour of this Markov assumption.
Here it is by no means clear. The concept of heredity, i.e., that the behaviour
of progeny is related to that of parents, clearly contradicts this assumption. How
to include heredity is another matter; by no means does a unique prescription
exist.

The assumption of the Markov postulate in this context is valid to the extent
that different individuals of the same species are similar; it is invalid to the extent
that, nevertheless, perceptible inheritable differences do exist.

This type of model has a wide application, in fact to any system to which a
population of individuals may be attributed, for example systems of molecules of
various chemical compounds, of electrons, of photons and similar physical particles
as well as biological systems. The particular choice of transition probabilities
is made on various grounds determined by the degree to which details of the
births and deaths involved are known. The simple multiplicative laws, as illustrated
in (1.3.3), are the most elementary choice, ignoring, as they do, almost all details of
the processes involved. In some of the physical processes we can derive the transition
probabilities in much greater detail and with greater precision.

Equation (1.3.4) has no simple solution, but one major property differentiates 
equations like it from an equation of Langevin’s type, in which the fluctuation term 
is simply added to the differential equation. Solutions of (1.3.4) determine both 
the gross deterministic motion and the fluctuations; the fluctuations are typically 
of the same order of magnitude as the square roots of the numbers of individuals 
involved. It is not difficult to simulate a sample time development of the process 
as in Fig. 1.3c. The figure does show the correct general features, but the model is 
so obviously simplified that exact agreement can never be expected. Thus, in 
contrast to the situation in Brownian motion, we are not dealing here so much 
with a theory of a phenomenon, as with a class of mathematical models, which 
are simple enough to have a very wide range of approximate validity. We will see 
in Chap. 7 that a theory can be developed which can deal with a wide range of 
models in this category, and that there is indeed a close connection between this kind 
of theory and that of stochastic differential equations. 
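
A sample path of the kind shown in Fig. 1.3c can be sketched in a few lines. The following is a minimal illustration, not the procedure used to produce the figure: it assumes the standard event-by-event (Gillespie-type) sampling of the transition laws (1.3.3), and the function name `simulate_predator_prey`, the combined parameter `k1a` (standing for the product $k_1 a$) and all numerical values are hypothetical choices made only for the example.

```python
import random

def simulate_predator_prey(x0, y0, k1a, k2, k3, t_max, seed=0):
    """Sample one path of the birth-death process defined by (1.3.3).

    x = number of prey, y = number of predators. The three possible events are
      prey birth      x -> x+1            rate k1a * x
      predation       x -> x-1, y -> y+1  rate k2 * x * y
      predator death  y -> y-1            rate k3 * y
    """
    rng = random.Random(seed)
    t, x, y = 0.0, x0, y0
    path = [(t, x, y)]
    while t < t_max:
        rates = (k1a * x, k2 * x * y, k3 * y)
        total = sum(rates)
        if total == 0.0:              # both species extinct: nothing can happen
            break
        t += rng.expovariate(total)   # exponential waiting time to the next event
        u = rng.random() * total      # choose the event in proportion to its rate
        if u < rates[0]:
            x += 1
        elif u < rates[0] + rates[1]:
            x -= 1
            y += 1
        else:
            y -= 1
        path.append((t, x, y))
    return path

# Illustrative (assumed) parameter values only.
path = simulate_predator_prey(x0=100, y0=50, k1a=1.0, k2=0.01, k3=1.0, t_max=5.0, seed=1)
print(len(path), path[-1])
```

Plotting x and y from `path` against time gives a noisy oscillation about the deterministic curves; each entry differs from the previous one by exactly one of the three transitions in (1.3.3).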


1.4 Noise in Electronic Systems 


The early days of radio, with low transmission powers and primitive receivers,
made it evident to every ear that there were a great number of highly irregular
electrical signals which occurred either in the atmosphere, the receiver, or the
radio transmitter, and which were given the collective name of “noise”, since this is
certainly what they sounded like on a radio. Two principal sources of noise are
shot noise and Johnson noise.


1.4.1 Shot Noise 


In a vacuum tube (and in solid-state devices) we get a nonsteady electrical current, 
since it is generated by individual electrons, which are accelerated across a distance 
and deposit their charge one at a time on the anode. The electric current arising 
from such a process can be written 


$$I(t) = \sum_{t_k} F(t - t_k), \tag{1.4.1}$$

where $F(t - t_k)$ represents the contribution to the current of an electron which arrives
at time $t_k$. Each electron is therefore assumed to give rise to the same shaped pulse,
but with an appropriate delay, as in Fig. 1.4.

A statistical aspect arises as soon as we consider what kind of choice must be
made for $t_k$. The simplest choice is that each electron arrives independently of the
previous one, that is, the times $t_k$ are randomly distributed with a certain average
number per unit time in the range $(-\infty, \infty)$, or whatever time is under consideration.

Fig. 1.4. Illustration of shot noise: identical electric pulses arrive at random times
(vertical axis: pulse height; horizontal axis: time)


The analysis of such noise was developed during the 1920’s and 1930’s and was 
summarised and largely completed by Rice [1.9]. It was first considered as early as 
1918 by Schottky [1.10]. 

We shall find that there is a close connection between shot noise and processes
described by birth-death master equations. For, if we consider n, the number of
electrons which have arrived up to a time t, to be a statistical quantity described by
a probability P(n, t), then the assumption that the electrons arrive independently is
clearly the Markov assumption. Then, assuming the probability that an electron
will arrive in the time interval between t and $t + \Delta t$ is completely independent of t
and n, its only dependence can be on $\Delta t$. By choosing an appropriate constant $\lambda$, we
may write


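$$\mathrm{Prob}(n \to n+1,\ \text{in time } \Delta t) = \lambda\,\Delta t \tag{1.4.2}$$

so that

$$P(n, t + \Delta t) = P(n, t)(1 - \lambda\,\Delta t) + P(n-1, t)\,\lambda\,\Delta t \tag{1.4.3}$$

and taking the limit $\Delta t \to 0$

$$\frac{\partial P(n,t)}{\partial t} = \lambda\,[P(n-1, t) - P(n, t)] \tag{1.4.4}$$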


which is a pure birth process. By writing

$$G(s, t) = \sum_n s^n P(n, t) \tag{1.4.5}$$

[here, G(s, t) is known as the generating function for P(n, t), and the particular
technique of solving (1.4.4) is very widely used], we find

$$\frac{\partial G(s,t)}{\partial t} = \lambda (s - 1)\,G(s, t) \tag{1.4.6}$$

so that

$$G(s, t) = \exp[\lambda (s - 1) t]\,G(s, 0). \tag{1.4.7}$$

By requiring at time t = 0 that no electrons had arrived, it is clear that P(0, 0) is
1 and P(n, 0) is zero for all $n \ge 1$, so that G(s, 0) = 1. Expanding the solution (1.4.7)
in powers of s, we find

$$P(n, t) = \exp(-\lambda t)\,(\lambda t)^n / n! \tag{1.4.8}$$


which is known as a Poisson distribution (Sect. 2.8.3). Let us introduce the variable
N(t), which is to be considered as the number of electrons which have arrived up to
time t, and is a random quantity. Then,

$$P(n, t) = \mathrm{Prob}\{N(t) = n\}, \tag{1.4.9}$$

and N(t) can be called a Poisson process variable. Then clearly, the quantity $\mu(t)$,
formally defined by

$$\mu(t) = dN(t)/dt, \tag{1.4.10}$$

is zero, except when N(t) increases by 1; at that stage it is a Dirac delta function,
i.e.,

$$\mu(t) = \sum_k \delta(t - t_k), \tag{1.4.11}$$

where the $t_k$ are the times of arrival of the individual electrons. We may write

$$I(t) = \int dt'\,F(t - t')\,\mu(t'). \tag{1.4.12}$$
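
As a numerical cross-check of the pure birth process (1.4.4) and its Poisson solution (1.4.8), one can step the probabilities forward using the finite-$\Delta t$ form (1.4.3) and compare with the closed formula. The sketch below assumes a truncation of n at a hypothetical `n_max` and illustrative values of the rate and the final time; the function names are introduced here only for the example.

```python
import math

def integrate_pure_birth(lam, t_end, n_max=40, steps=20000):
    """Step P(n, t) forward with (1.4.3), starting from P(n, 0) = 1 for n = 0, else 0."""
    dt = t_end / steps
    p = [1.0] + [0.0] * n_max
    for _ in range(steps):
        new = p[:]
        for n in range(n_max + 1):
            inflow = p[n - 1] if n > 0 else 0.0
            new[n] = p[n] + lam * dt * (inflow - p[n])
        p = new
    return p

def poisson(lam, t, n):
    """Closed-form solution (1.4.8)."""
    return math.exp(-lam * t) * (lam * t) ** n / math.factorial(n)

lam, t_end = 2.0, 3.0            # assumed rate and final time
numeric = integrate_pure_birth(lam, t_end)
max_err = max(abs(numeric[n] - poisson(lam, t_end, n)) for n in range(20))
print(max_err)
```

The discrepancy shrinks as the number of steps grows, in line with the limit $\Delta t \to 0$ taken between (1.4.3) and (1.4.4).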


A very reasonable restriction on $F(t - t')$ is that it vanishes if $t < t'$, and that
for $t \to \infty$, it also vanishes. This simply means that no current arises from an
electron before it arrives, and that the effect of its arrival eventually dies out. We
assume then, for simplicity, the very commonly encountered form

$$F(t) = \begin{cases} q\,e^{-\alpha t} & (t > 0) \\ 0 & (t < 0) \end{cases} \tag{1.4.13}$$

so that (1.4.12) can be rewritten as

$$I(t) = \int_{-\infty}^{t} dt'\,q\,e^{-\alpha (t - t')}\,\frac{dN(t')}{dt'}. \tag{1.4.14}$$

We can derive a simple differential equation. We differentiate I(t) to obtain

$$\frac{dI(t)}{dt} = \left[ q\,e^{-\alpha (t - t')}\,\frac{dN(t')}{dt'} \right]_{t'=t} + \int_{-\infty}^{t} dt'\,(-\alpha q)\,e^{-\alpha (t - t')}\,\frac{dN(t')}{dt'} \tag{1.4.15}$$

so that

$$\frac{dI(t)}{dt} = -\alpha I(t) + q\,\mu(t). \tag{1.4.16}$$
This is a kind of stochastic differential equation, similar to Langevin’s equation, in
which, however, the fluctuating force is given by $q\mu(t)$, where $\mu(t)$ is the derivative
of the Poisson process, as given by (1.4.11). However, the mean of $\mu(t)$ is nonzero;
in fact, from (1.4.10)

$$\langle \mu(t)\,dt \rangle = \langle dN(t) \rangle = \lambda\,dt \tag{1.4.17}$$
$$\langle [dN(t) - \lambda\,dt]^2 \rangle = \lambda\,dt \tag{1.4.18}$$

from the properties of the Poisson distribution, for which the variance equals the
mean. Defining, then, the fluctuation as the difference between dN(t) and its mean
value, we write

$$d\eta(t) = dN(t) - \lambda\,dt, \tag{1.4.19}$$

so that the stochastic differential equation (1.4.16) takes the form

$$dI(t) = [\lambda q - \alpha I(t)]\,dt + q\,d\eta(t). \tag{1.4.20}$$

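Now how does one solve such an equation? In this case, we have an academic problem
anyway since the solution is known, but one would like to have a technique.
Suppose we try to follow the method used by Langevin; what will we get as an
answer? The short reply to this question is: nonsense. For example, using ordinary
calculus and assuming $\langle I(t)\,d\eta(t) \rangle = 0$, we can derive

$$\frac{d\langle I(t) \rangle}{dt} = \lambda q - \alpha \langle I(t) \rangle \quad \text{and} \tag{1.4.21}$$
$$\frac{1}{2}\,\frac{d\langle I^2(t) \rangle}{dt} = \lambda q \langle I(t) \rangle - \alpha \langle I^2(t) \rangle; \tag{1.4.22}$$

solving in the limit $t \to \infty$, where the mean values would reasonably be expected
to be constant, one finds

$$\langle I(\infty) \rangle = \lambda q / \alpha \quad \text{and} \tag{1.4.23}$$
$$\langle I^2(\infty) \rangle = (\lambda q / \alpha)^2. \tag{1.4.24}$$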


The first answer is reasonable: it merely gives the average current through the
system. But the second implies that the mean square
current is the same as the square of the mean, i.e., the current at $t \to \infty$ does not
fluctuate! This is rather unreasonable, and the solution to the problem will show
that stochastic differential equations are rather more subtle than we have so far
presented.

Firstly, the notation in terms of differentials used in (1.4.17-20) has been chosen
deliberately. In deriving (1.4.22), one uses ordinary calculus, i.e., one writes

$$d(I^2) = (I + dI)^2 - I^2 = 2I\,dI + (dI)^2 \tag{1.4.25}$$

and then one drops the $(dI)^2$ as being of second order in dI. But now look at (1.4.18):
this is equivalent to

$$\langle d\eta(t)^2 \rangle = \lambda\,dt \tag{1.4.26}$$

so that a quantity of second order in $d\eta$ is actually of first order in dt. The reason
is not difficult to find. Clearly,

$$d\eta(t) = dN(t) - \lambda\,dt, \tag{1.4.27}$$

but the curve of N(t) is a step function, discontinuous, and certainly not differentiable,
at the times of arrival of the individual electrons. In the ordinary sense,
none of these calculus manipulations is permissible. But we can make sense out of
them as follows. Let us simply calculate $\langle d(I^2) \rangle$ using (1.4.20, 25, 26):

$$\langle d(I^2) \rangle = \langle 2I \{ [\lambda q - \alpha I]\,dt + q\,d\eta(t) \} \rangle + \langle \{ [\lambda q - \alpha I]\,dt + q\,d\eta(t) \}^2 \rangle. \tag{1.4.28}$$


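We now assume again that $\langle I(t)\,d\eta(t) \rangle = 0$ and expand, after taking averages
using the fact that $\langle d\eta(t)^2 \rangle = \lambda\,dt$, to first order in dt. We obtain

$$\frac{1}{2}\,d\langle I^2 \rangle = \left\{ \lambda q \langle I \rangle - \alpha \langle I^2 \rangle + \frac{q^2 \lambda}{2} \right\} dt \tag{1.4.29}$$

and this gives

$$\langle I^2(\infty) \rangle - \langle I(\infty) \rangle^2 = \frac{q^2 \lambda}{2 \alpha}. \tag{1.4.30}$$

Thus, there are fluctuations from this point of view, as $t \to \infty$. The extra term in
(1.4.29) as compared to (1.4.22) arises directly out of the statistical considerations
implicit in N(t) being a discontinuous random function.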

Thus we have discovered a somewhat deeper way of looking at Langevin’s kind 
of equation—the treatment of which, from this point of view, now seems extremely 
naive. In Langevin’s method the fluctuating force X is not specified, but it will 
become clear in this book that problems such as we have just considered are very 
widespread in this subject. The moral is that random functions cannot normally 
be differentiated according to the usual laws of calculus: special rules have to be 
developed, and a precise specification of what one means by differentiation becomes 
important. We will specify these problems and their solutions in Chap. 4 which will 
concern itself with situations in which the fluctuations are Gaussian. 
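
The stationary results (1.4.23) and (1.4.30) can be compared with a direct simulation of the shot-noise current. The following is only a sketch under stated assumptions: it takes the exponential pulse shape (1.4.13), generates Poisson arrivals from exponential waiting times, and uses illustrative values for the rate, the charge q and the decay constant; the helper names `sample_current` and `stationary_moments` are hypothetical.

```python
import math
import random
import statistics

def sample_current(lam, q, alpha, t_obs, rng):
    """Return one sample of I(t_obs) = sum_k q*exp(-alpha*(t_obs - t_k)).

    The arrival times t_k form a Poisson process of rate lam on [0, t_obs],
    built from independent exponential waiting times.
    """
    current = 0.0
    t = rng.expovariate(lam)
    while t < t_obs:
        current += q * math.exp(-alpha * (t_obs - t))
        t += rng.expovariate(lam)
    return current

def stationary_moments(lam=20.0, q=1.0, alpha=1.0, n_samples=2000, seed=0):
    """Estimate the long-time mean and variance of the current from independent samples."""
    rng = random.Random(seed)
    t_obs = 15.0 / alpha          # long enough that exp(-alpha * t_obs) is negligible
    samples = [sample_current(lam, q, alpha, t_obs, rng) for _ in range(n_samples)]
    return statistics.fmean(samples), statistics.variance(samples)

mean, var = stationary_moments()
print(mean, 20.0 * 1.0 / 1.0)            # compare with lam*q/alpha          (1.4.23)
print(var, 1.0 ** 2 * 20.0 / (2 * 1.0))  # compare with q**2*lam/(2*alpha)   (1.4.30)
```

The sample variance is clearly nonzero and close to $q^2\lambda/2\alpha$, which is the term that the naive use of ordinary calculus in (1.4.22) fails to produce.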


1.4.2 Autocorrelation Functions and Spectra 


The measurements which one can carry out on fluctuating systems such as electric
circuits are, in practice, not of unlimited variety. So far, we have considered the
distribution functions, which tell us, at any time, what the probability distribution
of the values of a stochastic quantity is. If we are considering a measurable quantity
x(t) which fluctuates with time, in practice we can sometimes determine the
distribution of the values of x, though more usually, what is available at one time
are the mean $\bar{x}(t)$ and the variance var{x(t)}.

The mean and the variance do not tell a great deal about the underlying dynamics
of what is happening. What would be of interest is some quantity which is a
measure of the influence of a value of x at time t on the value at time $t + \tau$.
Such a quantity is the autocorrelation function, which was apparently first introduced
by Taylor [1.11] as

$$G(\tau) = \lim_{T \to \infty} \frac{1}{T} \int_0^T dt\,x(t)\,x(t + \tau). \tag{1.4.31}$$

This is the time average of a two-time product over an arbitrarily large time T,
which is then allowed to become infinite.

Nowadays purpose-built autocorrelators exist, which sample data and directly
construct the autocorrelation function of a desired process, from laser light
scattering signals to bacterial counts. It is also possible to construct autocorrelation
programs for high-speed on-line experimental computers. Further, for very fast
systems, there are clipped autocorrelators, which measure an approximation to
the autocorrelation function given by defining a variable c(t) such that

$$c(t) = \begin{cases} 0 & x(t) < l \\ 1 & x(t) > l \end{cases} \tag{1.4.32}$$

and computing the autocorrelation function of that variable.
A more traditional approach is to compute the spectrum of the quantity x(t).
This is defined in two stages. First, define

$$y(\omega) = \int_0^T dt\,e^{-i\omega t}\,x(t); \tag{1.4.33}$$

then the spectrum is defined by

$$S(\omega) = \lim_{T \to \infty} \frac{1}{2\pi T}\,|y(\omega)|^2. \tag{1.4.34}$$


The autocorrelation function and the spectrum are closely connected. By a little
manipulation one finds

$$S(\omega) = \lim_{T \to \infty} \left[ \frac{1}{\pi} \int_0^T \cos(\omega\tau)\,d\tau\,\frac{1}{T} \int_0^{T-\tau} x(t)\,x(t + \tau)\,dt \right] \tag{1.4.35}$$

and taking the limit $T \to \infty$ (under suitable assumptions to ensure the validity of
certain interchanges of order), one finds

$$S(\omega) = \frac{1}{\pi} \int_0^\infty \cos(\omega\tau)\,G(\tau)\,d\tau. \tag{1.4.36}$$

This is a fundamental result which relates the Fourier transform of the autocorrelation
function to the spectrum. The result may be put in a slightly different form
when one notices that

$$G(-\tau) = \lim_{T \to \infty} \frac{1}{T} \int_{-\tau}^{T-\tau} dt\,x(t + \tau)\,x(t) = G(\tau) \tag{1.4.37}$$

so we obtain

$$S(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i\omega\tau}\,G(\tau)\,d\tau \tag{1.4.38}$$

with the corresponding inverse

$$G(\tau) = \int_{-\infty}^{\infty} e^{i\omega\tau}\,S(\omega)\,d\omega. \tag{1.4.39}$$


This result is known as the Wiener-Khinchin theorem [1.12,13] and has widespread
application.

It means that one may directly measure either the autocorrelation function of a
signal or its spectrum, and convert back and forth, which is relatively straightforward
by means of the fast Fourier transform and a computer.
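
The conversion between the two descriptions can be illustrated on sampled data. The sketch below checks a discrete, periodic analogue of the Wiener-Khinchin relation, not the continuous-time limit itself: for N samples, the discrete Fourier transform of the circular autocorrelation equals the periodogram $|X_k|^2/N$ exactly. The plain O(N²) transform, the absence of the $1/2\pi$ normalisation of (1.4.38), and the test signal are all assumptions made to keep the example self-contained.

```python
import cmath
import random

def dft(values):
    """Plain O(N^2) discrete Fourier transform (an FFT would be used in practice)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * m / n) for m, v in enumerate(values))
            for k in range(n)]

def circular_autocorrelation(x):
    """G[m] = (1/N) * sum_j x[j] * x[(j + m) mod N], a discrete analogue of (1.4.31)."""
    n = len(x)
    return [sum(x[j] * x[(j + m) % n] for j in range(n)) / n for m in range(n)]

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(64)]       # assumed test signal

periodogram = [abs(c) ** 2 / len(x) for c in dft(x)]            # spectrum route
spectrum_from_g = [c.real for c in dft(circular_autocorrelation(x))]  # correlation route
max_diff = max(abs(a - b) for a, b in zip(periodogram, spectrum_from_g))
print(max_diff)
```

The two routes agree to rounding error, which is the discrete counterpart of being able to measure either G or S and convert to the other.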


1.4.3 Fourier Analysis of Fluctuating Functions: Stationary Systems 


The autocorrelation function has been defined so far as a time average of a signal,
but we may also consider the ensemble average, in which we repeat the same measurement
many times, and compute averages, denoted by $\langle\ \rangle$. It will be shown
that for very many systems, the time average is equal to the ensemble average;
such systems are termed ergodic (Sect. 3.7.1).

If we have such a fluctuating quantity x(t), then we can consider the average

$$\langle x(t)\,x(t + \tau) \rangle = G(\tau), \tag{1.4.40}$$

this result being the consequence of our ergodic assumption.

Now it is very natural to write a Fourier transform for the stochastic quantity x(t)

$$x(t) = \int d\omega\,c(\omega)\,e^{i\omega t} \tag{1.4.41}$$

and consequently,

$$c(\omega) = \frac{1}{2\pi} \int dt\,x(t)\,e^{-i\omega t}. \tag{1.4.42}$$

Note that x(t) real implies

$$c(\omega) = c^*(-\omega). \tag{1.4.43}$$


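If the system is ergodic, we must have a constant $\langle x(t) \rangle$, since the time average
is clearly constant. The process is then stationary, by which we mean that all
time-dependent averages are functions only of time differences, i.e., averages of functions
of $x(t_1), x(t_2), \dots, x(t_n)$ are equal to those of $x(t_1 + \Delta), x(t_2 + \Delta), \dots, x(t_n + \Delta)$.

For convenience, in what follows we assume $\langle x \rangle = 0$. Hence,

$$\langle c(\omega) \rangle = \frac{1}{2\pi} \int dt\,\langle x \rangle\,e^{-i\omega t} = 0 \tag{1.4.44}$$

$$\begin{aligned}
\langle c(\omega)\,c^*(\omega') \rangle &= \frac{1}{(2\pi)^2} \iint dt\,dt'\,e^{-i\omega t + i\omega' t'}\,\langle x(t)\,x(t') \rangle \\
&= \frac{1}{2\pi}\,\delta(\omega - \omega') \int d\tau\,e^{-i\omega\tau}\,G(\tau) \\
&= \delta(\omega - \omega')\,S(\omega).
\end{aligned} \tag{1.4.45}$$

Here we find not only a relationship between the mean square $\langle |c(\omega)|^2 \rangle$ and the
spectrum, but also the result that stationarity alone implies that $c(\omega)$ and $c^*(\omega')$
are uncorrelated, since the term $\delta(\omega - \omega')$ arises because $\langle x(t)\,x(t') \rangle$ is a function
only of $t - t'$.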


1.4.4 Johnson Noise and Nyquist’s Theorem 


Two brief and elegant papers appeared in 1928 in which Johnson [1.14] demonstrated
experimentally that an electric resistor automatically generated fluctuations
of electric voltage, and Nyquist [1.15] gave its theoretical derivation, in
complete accordance with Johnson’s experiment. The principle involved was
already known by Schottky [1.10] and is the same as that used by Einstein and
Langevin. This principle is that of thermal equilibrium. If a resistor R produces
electric fluctuations, these will produce a current which will generate heat. The heat
produced in the resistor must exactly balance the energy taken out of the fluctuations.
The detailed working out of this principle is not the subject of this section,
but we will find that such results are common throughout the physics and chemistry
of stochastic processes, where the principles of statistical mechanics, whose
basis is not essentially stochastic, are brought in to complement those of stochastic
processes. The experimental result found was the following. We have an electric
resistor of resistance R at absolute temperature T. Suppose by means of a suitable
filter we measure $E(\omega)\,d\omega$, the voltage across the resistor with angular frequency in
the range $(\omega, \omega + d\omega)$. Then, if k is Boltzmann’s constant,

$$\langle E^2(\omega) \rangle = RkT/\pi. \tag{1.4.46}$$

This result is known nowadays as Nyquist’s theorem. Johnson remarked: “The
effect is one of the causes of what is called ‘tube noise’ in vacuum tube amplifiers.
Indeed, it is often by far the larger part of the ‘noise’ of a good amplifier.”

Johnson noise is easily described by the formalism of the previous subsection.
The mean noise voltage is zero across a resistor, and the system is arranged so that
it is in a steady state and is expected to be well represented by a stationary process.
Johnson’s quantity is, in practice, a limit of the kind (1.4.34) and may be summarised
by saying that the voltage spectrum $S(\omega)$ is given by

$$S(\omega) = RkT/\pi, \tag{1.4.47}$$

that is, the spectrum is flat, i.e., a constant function of $\omega$. In the case of light, the
frequencies correspond to different colours of light. If we perceive light to be white,
it is found that in practice all colours are present in equal proportions; the optical
spectrum of white light is thus flat, at least within the visible range. In analogy, the
term white noise is applied to a noise voltage (or any other fluctuating quantity)
whose spectrum is flat.

White noise cannot actually exist. The simplest demonstration is to note that
the mean power dissipated in the resistor in the frequency range $(\omega_1, \omega_2)$ is given
by

$$\int_{\omega_1}^{\omega_2} d\omega\,S(\omega)/R = kT(\omega_2 - \omega_1)/\pi \tag{1.4.48}$$

so that the total power dissipated in all frequencies is infinite! Nyquist realised this,
and noted that, in practice, there would be quantum corrections which would,
at room temperature, make the spectrum flat only up to $7 \times 10^{12}$ Hz, which is not
detectable in practice, in a radio situation. The actual power dissipated in the
resistor would be somewhat less than infinite, $10^{-10}$ W in fact! And in practice
there are other limiting factors such as the inductance of the system, which would
limit the spectrum to even lower frequencies.

From the definition of the spectrum in terms of the autocorrelation function 
given in Sect. 1.4, we have 


$$\langle E(t + \tau)\,E(t) \rangle = G(\tau) \tag{1.4.49}$$
$$= \frac{1}{2\pi} \int_{-\infty}^{\infty} d\omega\,e^{-i\omega\tau}\,2RkT \tag{1.4.50}$$
$$= 2RkT\,\delta(\tau), \tag{1.4.51}$$

which implies that no matter how small the time difference $\tau$, $E(t + \tau)$ and E(t)
are not correlated. This is, of course, a direct result of the flatness of the spectrum.
A typical model of $S(\omega)$ that is almost flat is

$$S(\omega) = RkT/[\pi(\omega^2 \tau_c^2 + 1)]. \tag{1.4.52}$$

Fig. 1.5a,b. Correlation functions (solid line) and corresponding spectra (dashed line) for (a) short
correlation time corresponding to an almost flat spectrum; (b) long correlation time, giving a
quite rapidly decreasing spectrum


This is flat provided $\omega \ll \tau_c^{-1}$. The Fourier transform can be explicitly evaluated
in this case to give

$$\langle E(t + \tau)\,E(t) \rangle = (RkT/\tau_c)\,\exp(-\tau/\tau_c) \tag{1.4.53}$$

so that the autocorrelation function vanishes only for $\tau \gg \tau_c$, where $\tau_c$ is called the
correlation time of the fluctuating voltage. Thus, the delta function correlation
function appears as an idealisation, only valid on a sufficiently long time scale.
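
The correspondence between the nearly flat spectrum (1.4.52) and the exponential correlation function (1.4.53) can be checked numerically through the cosine transform (1.4.36). The sketch below is an illustration only: it assumes a simple trapezoid rule with a finite cutoff, writes the product RkT as a single hypothetical parameter `amplitude`, and uses arbitrary values for the correlation time and the test frequencies.

```python
import math

def spectrum_from_correlation(omega, amplitude, tau_c, n_steps=40000, cutoff=40.0):
    """Evaluate (1/pi) * integral_0^inf cos(omega*tau) * G(tau) dtau by the trapezoid rule,
    with G(tau) = (amplitude / tau_c) * exp(-tau / tau_c) as in (1.4.53)."""
    upper = cutoff * tau_c            # G is negligible beyond this point
    h = upper / n_steps

    def integrand(tau):
        return math.cos(omega * tau) * (amplitude / tau_c) * math.exp(-tau / tau_c)

    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(i * h) for i in range(1, n_steps))
    return total * h / math.pi

def lorentzian(omega, amplitude, tau_c):
    """The model spectrum (1.4.52), with amplitude standing for R*k*T."""
    return amplitude / (math.pi * (omega ** 2 * tau_c ** 2 + 1.0))

amplitude, tau_c = 1.0, 0.5           # assumed illustrative values
for omega in (0.0, 0.5, 2.0, 5.0):
    print(omega, spectrum_from_correlation(omega, amplitude, tau_c),
          lorentzian(omega, amplitude, tau_c))
```

For frequencies well below $1/\tau_c$ the printed values are nearly constant, which is the sense in which a short correlation time gives an almost flat spectrum.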

This is very reminiscent of Einstein’s assumption regarding Brownian motion
and of the behaviour of Langevin’s fluctuating force. The idealised white noise will
play a highly important role in this book but, in just the same way as the fluctuation
term that arises in a stochastic differential equation is not the same as an ordinary
differential, we will find that differential equations which include white noise as a
driving term have to be handled with great care. Such equations arise very
naturally in any fluctuating system, and it is possible to arrange, by means of
Stratonovich’s rules, for ordinary calculus rules to apply, but at the cost of imprecise
mathematical definition and some difficulties in stochastic manipulation. It turns out to
be far better to abandon ordinary calculus and use the Ito calculus, which is not
very different (it is, in fact, very similar to the calculus presented for shot noise)
and which preserves tractable statistical properties. All these matters will be discussed
thoroughly in Chap. 4.

White noise, as we have noted above, does not exist as a physically realisable 
process and the rather singular behaviour it exhibits does not arise in any realisable 
context. It is, however, fundamental in a mathematical, and indeed in a physical 
sense, in that it is an idealisation of very many processes that do occur. The slightly 
strange rules which we will develop for the calculus of white noise are not really 
very difficult and are very much easier to handle than any method which always 
deals with a real noise. Furthermore, situations in which white noise is not a good 
approximation can very often be indirectly expressed quite simply in terms of 
white noise. In this sense, white noise is the starting point from which a wide range 
of stochastic descriptions can be derived, and is therefore fundamental to the 
subject of this book. 


2. Probability Concepts 


In the preceding chapter, we introduced probability notions without any definitions. 
In order to formulate essential concepts more precisely, it is necessary to have 
some more precise expression of these concepts. The intention of this chapter is to 
provide some background, and to present a number of essential results. It is not a 
thorough outline of mathematical probability, for which the reader is referred to 
standard mathematical texts such as those by Feller [2.1] and Papoulis [2.2]. 


2.1 Events, and Sets of Events 


It is convenient to use a notation which is as general as possible in order to describe
those occurrences to which we might wish to assign probabilities. For example,
we may wish to talk about a situation in which there are $6.4 \times 10^{14}$ molecules in a
certain region of space; or a situation in which a Brownian particle is at a certain
point x in space; or possibly there are 10 mice and 3 owls in a certain region of a
forest.

These occurrences are all examples of practical realisations of events. More
abstractly, an event is simply a member of a certain space, which in the cases most
practically occurring can be characterised by a vector of integers

$$n = (n_1, n_2, n_3, \dots) \tag{2.1.1}$$

or a vector of real numbers

$$x = (x_1, x_2, x_3, \dots). \tag{2.1.2}$$

The dimension of the vector is arbitrary.
It is convenient to use the language of set theory, introduce the concept of a set
of events, and use the notation

$$\omega \in A \tag{2.1.3}$$

to indicate that the event $\omega$ is one of the events contained in A. For example, one
may consider the set A(25) of events in the ecological population in which there
are no more than 25 animals present; clearly the event $\omega$ that there are 3 mice, a
tiger, and no other animals present satisfies

$$\omega \in A(25). \tag{2.1.4}$$

More significantly, suppose we define the set of events $A(r, \Delta V)$ that a molecule
is within a volume element $\Delta V$ centred on a point r. In this case, the practical
significance of working in terms of sets of events becomes clear, because we should
normally be able to determine whether or not a molecule is within a neighbourhood
$\Delta V$ of r, but to determine whether the particle is exactly at r is impossible. Thus, if
we define the event $\omega(y)$ that the molecule is at point y, it makes sense to ask
whether

$$\omega(y) \in A(r, \Delta V) \tag{2.1.5}$$

and to assign a certain probability to the set $A(r, \Delta V)$, which is to be interpreted as
the probability of the occurrence of (2.1.5).


2.2 Probabilities 


Most people have an intuitive conception of a probability, based on their own
experience. However, a precise formulation of intuitive concepts is fraught with
difficulties, and it has been found most convenient to axiomatise probability theory
as an essentially abstract science, in which a probability measure P(A) is assigned
to every set A, in the space of events, including

the set of all events: $\Omega$ (2.2.1)
the set of no events: $\emptyset$; (2.2.2)

in order to define probability, we need our sets of events to form a closed system
(known by mathematicians as a $\sigma$-algebra) under the set theoretic operations of
union and intersection.


2.2.1 Probability Axioms 


We introduce the probability of A, P(A), as a function of A satisfying the following
probability axioms:

(i) $P(A) \ge 0$ for all A, (2.2.3)
(ii) $P(\Omega) = 1$, (2.2.4)
(iii) if $A_i$ (i = 1, 2, 3, ...) is a countable (but possibly infinite) collection of
nonoverlapping sets, i.e., such that

$$A_i \cap A_j = \emptyset \quad \text{for all } i \ne j, \tag{2.2.5}$$

then

$$P\Big(\bigcup_i A_i\Big) = \sum_i P(A_i). \tag{2.2.6}$$

These are all the axioms needed. Consequently, however, we have:

(iv) if $\bar{A}$ is the complement of A, i.e., the set of all events not contained in A,
then

$$P(\bar{A}) = 1 - P(A), \tag{2.2.7}$$

(v) $P(\emptyset) = 0$. (2.2.8)

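
The axioms and their consequences can be verified mechanically on a small finite example. The sketch below assumes, purely for illustration, the space of outcomes of one fair die with equal a priori weights; the names `OMEGA` and `prob` are hypothetical, and exact fractions are used so that the equalities hold without rounding.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))     # the six faces of one die (illustrative choice)

def prob(event):
    """Probability measure: equal a priori weight 1/6 on each face."""
    return Fraction(len(frozenset(event) & OMEGA), len(OMEGA))

even = frozenset({2, 4, 6})
one = frozenset({1})
complement_even = OMEGA - even

print(prob(OMEGA))                                # axiom (ii)
print(prob(even | one), prob(even) + prob(one))   # axiom (iii) for two disjoint sets
print(prob(complement_even), 1 - prob(even))      # consequence (iv)
print(prob(frozenset()))                          # consequence (v)
```

Only finite unions can be tested this way; the countable case in axiom (iii) is a genuine additional requirement that no finite example exhibits.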
2.2.2 The Meaning of P(A) 


There is no way of making probability theory correspond to reality without
requiring a certain degree of intuition. The probability P(A), as axiomatised above,
is the intuitive probability that an “arbitrary” event $\omega$, i.e., an event $\omega$ “chosen at
random”, will satisfy $\omega \in A$. Or more explicitly, if we choose an event “at random”
from $\Omega$ N times, the relative frequency that the particular event chosen will satisfy
$\omega \in A$ approaches P(A) as the number of times, N, we choose the event, approaches
infinity. The N choices can be visualised as being made one after the
other (“independent” tosses of one die) or at the same time (N dice are thrown at the
same time “independently”). All definitions of this kind must be intuitive, as we
can see by the way undefined terms (“arbitrary”, “at random”, “independent”) keep
turning up. By eliminating what we now think of as intuitive ideas and axiomatising
probability, Kolmogorov [2.3] cleared the road for a rigorous development of
mathematical probability. But the circular definition problems posed by wanting
an intuitive understanding remain. The simplest way of looking at axiomatic
probability is as a formal method of manipulating probabilities using the axioms. In
order to apply the theory, the probability space must be defined and the probability
measure P assigned. These are a priori probabilities, which are simply assumed.
Examples of such a priori probabilities abound in applied disciplines. For example,
in equilibrium statistical mechanics one assigns equal probabilities to equal volumes
of phase space. Einstein’s reasoning in Brownian motion assigned a probability $\phi(\Delta)$
to a “push” $\Delta$ from a position x at time t.
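
The relative-frequency reading of P(A) can be made concrete with a short simulation. This is only a sketch: it assumes a fair die, takes A to be the set of even faces, and relies on a pseudo-random generator as a stand-in for “independent” throws; the function name `relative_frequency` is hypothetical.

```python
import random

def relative_frequency(n_throws, event=frozenset({2, 4, 6}), seed=0):
    """Fraction of n_throws simulated fair-die throws whose outcome lies in event."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_throws) if rng.randint(1, 6) in event)
    return hits / n_throws

for n in (10, 1000, 100000):
    print(n, relative_frequency(n))   # approaches the a priori value 1/2 as n grows
```

Note that the a priori probability 1/6 per face is built into the generator, so the simulation illustrates the interpretation rather than proving it.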

The task of applying probability is (i) to assume some set of a priori probabilities
which seem reasonable and to deduce results from this and from the structure of the
probability space, (ii) to measure experimental results with some apparatus which
is constructed to measure quantities in accordance with these a priori probabilities.

The structure of the probability space is very important, especially when the
space of events is compounded by the additional concept of time. This extension
makes the effective probability space infinite-dimensional, since we can construct
events such as “the particle was at points $x_n$ at times $t_n$, n = 0, 1, 2, ..., $\infty$”.


2.2.3 The Meaning of the Axioms 


Any intuitive concept of probability gives rise to nonnegative probabilities, and the
probability that an arbitrary event is contained in the set of all events must be 1
no matter what our definition of the word arbitrary. Hence, axioms (i) and (ii) are
understandable. The heart of the matter lies in axiom (iii). Suppose we are dealing
with only 2 sets A and B, and $A \cap B = \emptyset$. This means there are no events contained
in both A and B. Therefore, the probability that $\omega \in A \cup B$ is the probability
that either $\omega \in A$ or $\omega \in B$. Intuitive considerations tell us this probability is
the sum of the individual probabilities, i.e.,

$$P(A \cup B) = P\{(\omega \in A) \text{ or } (\omega \in B)\} = P(A) + P(B) \tag{2.2.9}$$

(notice this is not a proof, merely an explanation).

The extension now to any finite number of nonoverlapping sets is obvious, but
the extension only to any countable number of nonoverlapping sets requires some
comment.

This extension must be made restrictive because of the existence of sets labelled
by a continuous index, for example, x, the position in space. The probability of a
molecule being in the set whose only element is x is zero; but the probability of
being in a region R of finite volume is nonzero. The region R is a union of sets of
the form {x}, but not a countable union. Thus axiom (iii) is not applicable and the
probability of being in R is not equal to the sum of the probabilities of being in {x}.


2.2.4 Random Variables 


The concept of a random variable is a notational convenience which is central to
this book. Suppose we have an abstract probability space whose events can be
written x. Then we can introduce the random variable F(x) which is a function of
x, which takes on certain values for each x. In particular, the identity function of
x, written X(x), is of interest; it is given by

$$X(x) = x. \tag{2.2.10}$$

We shall normally use capitals in this book to denote random variables and small
letters x to denote their values whenever it is necessary to make a distinction.

Very often, we have some quite different underlying probability space $\Omega$ with
values $\omega$, and talk about $X(\omega)$ which is some function of $\omega$, and then omit explicit
mention of $\omega$. This can be for either of two reasons:


i) we specify the events by the values of x anyway, i.e., we identify x and $\omega$;
ii) the underlying events $\omega$ are too complicated to describe, or sometimes, even
to know.


For example, in the case of the position of a molecule in a liquid, we really 
should interpret each w as being capable of specifying all the positions, momenta, 
and orientations of each molecule in that volume of liquid; but this is simply too 
difficult to write down, and often unnecessary. 

One great advantage of introducing the concept of a random variable is the
simplicity with which one may handle functions of random variables, e.g., $X^2$,
$\sin(a \cdot X)$, etc., and compute means and distributions of these. Further, by defining
stochastic differential equations, one can also quite simply talk about time development
of random variables in a way which is quite analogous to the classical
description by means of differential equations of nonprobabilistic systems.


2.3 Joint and Conditional Probabilities: Independence 


2.3.1 Joint Probabilities 


We explained in Sect. 2.2.3 how the occurrence of mutually exclusive events is related
to the concept of nonintersecting sets. We now consider the concept $P(A \cap B)$, where
$A \cap B$ is nonempty. An event $\omega$ which satisfies $\omega \in A$ will only satisfy $\omega \in A \cap B$
if $\omega \in B$ as well.

Thus,

$$P(A \cap B) = P\{(\omega \in A) \text{ and } (\omega \in B)\} \tag{2.3.1}$$

and $P(A \cap B)$ is called the joint probability that the event $\omega$ is contained in both
classes, or, alternatively, that both the events $\omega \in A$ and $\omega \in B$ occur. Joint probabilities
occur naturally in the context of this book in two ways:


i) When the event is specified by a vector, e.g., m mice and n tigers. The probability 
of this event is the joint probability of [m mice (and any number of tigers)] and 
[n tigers (and any number of mice)]. All vector specifications are implicitly joint 
probabilities in this sense. 


ii) When more than one time is considered: what is the probability that (at time $t_1$
there are $m_1$ tigers and $n_1$ mice) and (at time $t_2$ there are $m_2$ tigers and $n_2$ mice).
To consider such a probability, we have effectively created out of the events at time
$t_1$ and events at time $t_2$, joint events involving one event at each time. In essence,
there is no difference between these two cases except for the fundamental dynamical
role of time.


2.3.2 Conditional Probabilities 


We may specify conditions on the events we are interested in and consider only 
these, e.g., the probability of 21 buffaloes given that we know there are 100 lions. 
What does this mean? Clearly, we will be interested only in those events contained 
in the set B = {all events where exactly 100 lions occur}. This means that we wish to
define conditional probabilities, which are defined only on the collection of all sets
contained in B. We define the conditional probability as

$$P(A \mid B) = P(A \cap B)/P(B) \tag{2.3.2}$$

and this satisfies our intuitive conception that the conditional probability that
$\omega \in A$ (given that we know $\omega \in B$) is given by dividing the probability of joint
occurrence by the probability of $(\omega \in B)$.

We can define in both directions, i.e., we have 


$$P(A \cap B) = P(A \mid B)P(B) = P(B \mid A)P(A). \tag{2.3.3}$$


There is no particular conceptual difference between, say, the probability of {(21 
buffaloes) given (100 lions)} and the reversed concept. However, when two times 
are involved, we do see a difference. For example, the probability that a particle is
at position $x_1$ at time $t_1$, given that it was at $x_2$ at the previous time $t_2$, is a very natural
thing to consider; indeed, it will turn out to be a central concept in this book.
The converse sounds strange, i.e., the probability that a particle is at position $x_1$
at time $t_1$, given that it will be at position $x_2$ at a later time $t_2$. It smacks of clairvoyance:
we cannot conceive of any natural way in which we would wish to consider
it, although it is, in principle, a quantity very similar to the "natural" conditional
probability, in which the condition precedes the events under consideration.

The natural definition has already occurred in this book; for example, the
$\phi(\Delta)\,d\Delta$ of Einstein (Sect. 1.2.1) is the probability that a particle at x at time t will
be in the range $[x + \Delta, x + \Delta + d\Delta]$ at time $t + \tau$, and similarly in the other
examples. Our intuition tells us, as it told Einstein (as can be seen by reading the
extract from his paper), that this kind of conditional probability is directly related
to the time development of a probabilistic system.


2.3.3 Relationship Between Joint Probabilities of Different Orders 

Suppose we have a collection of sets $B_i$ such that

$$B_i \cap B_j = \emptyset \quad (i \ne j) \tag{2.3.4}$$
$$\bigcup_i B_i = \Omega \tag{2.3.5}$$

so that the sets divide up the space $\Omega$ into nonoverlapping subsets.
Then

$$\bigcup_i (A \cap B_i) = A \cap \Big(\bigcup_i B_i\Big) = A \cap \Omega = A. \tag{2.3.6}$$

Using now the probability axiom (iii), we see that the $A \cap B_i$ satisfy the conditions
on the $A_i$ used there, so that

$$\sum_i P(A \cap B_i) = P\Big[\bigcup_i (A \cap B_i)\Big] \tag{2.3.7}$$
$$= P(A) \tag{2.3.8}$$

and thus

$$\sum_i P(A \mid B_i)P(B_i) = P(A). \tag{2.3.9}$$

Thus, summing over all mutually exclusive possibilities of B in the joint probability
eliminates that variable.
Hence, in general,

$$\sum_i P(A_i \cap B_j \cap C_k \cap \dots) = P(B_j \cap C_k \cap \dots). \tag{2.3.10}$$


The result (2.3.9) has very significant consequences in the development of the theory 
of stochastic processes, which depends heavily on joint probabilities. 
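
The summation rule (2.3.9) can be checked on a small discrete example. The sketch below is only an illustration: the joint probabilities and the labels a1, a2, b1, b2, b3 are hypothetical values chosen so that the table sums to 1, and the helper names are our own.

```python
from fractions import Fraction

# Hypothetical joint probabilities P(A_i and B_j); illustrative values only,
# chosen so that the whole table sums to 1.
joint = {
    ("a1", "b1"): Fraction(1, 10), ("a1", "b2"): Fraction(2, 10), ("a1", "b3"): Fraction(1, 10),
    ("a2", "b1"): Fraction(3, 10), ("a2", "b2"): Fraction(1, 10), ("a2", "b3"): Fraction(2, 10),
}

def marginal_a(a):
    """P(A) obtained by summing the joint probability over all B_j."""
    return sum(p for (ai, _), p in joint.items() if ai == a)

def marginal_b(b):
    """P(B) obtained by summing the joint probability over all A_i."""
    return sum(p for (_, bj), p in joint.items() if bj == b)

def conditional_a_given_b(a, b):
    """P(A | B) = P(A and B) / P(B), as in (2.3.2)."""
    return joint[(a, b)] / marginal_b(b)

def total_probability(a):
    """Left-hand side of (2.3.9): sum over j of P(A | B_j) P(B_j)."""
    bs = sorted({b for (_, b) in joint})
    return sum(conditional_a_given_b(a, b) * marginal_b(b) for b in bs)

print(total_probability("a1"), marginal_a("a1"))
```

Exact fractions are used so that the two sides of (2.3.9) can be compared without rounding error.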


2.3.4 Independence


We need a probabilistic way of specifying what we mean by independent events. 
Two sets of events A and B should represent independent sets of events if the speci- 
fication that a particular event is contained in B has no influence on the probability 
of that event belonging to A. Thus, the conditional probability $P(A \mid B)$ should be
independent of B, and hence

$$P(A \cap B) = P(A)P(B). \tag{2.3.11}$$


In the case of several events, we need a somewhat stronger specification. The events
$(\omega \in A_i)$ $(i = 1, 2, \dots, n)$ will be considered to be independent if for any subset
$(i_1, i_2, \dots, i_k)$ of the set $(1, 2, \dots, n)$,

$$P(A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_k}) = P(A_{i_1})P(A_{i_2}) \dots P(A_{i_k}). \tag{2.3.12}$$

It is important to require factorisation for all possible combinations, as in (2.3.12).
For example, for three sets $A_i$, it is quite conceivable that

$$P(A_i \cap A_j) = P(A_i)P(A_j) \tag{2.3.13}$$

for all different i and j, but also that

$$A_1 \cap A_2 = A_2 \cap A_3 = A_3 \cap A_1 \quad \text{(see Fig. 2.1)}.$$

This requires

$$P(A_1 \cap A_2 \cap A_3) = P(A_2 \cap A_3 \cap A_3) = P(A_2 \cap A_3) = P(A_2)P(A_3) \ne P(A_1)P(A_2)P(A_3). \tag{2.3.14}$$

We can see that the occurrence of $\omega \in A_2$ and $\omega \in A_3$ necessarily implies the occurrence
of $\omega \in A_1$. In this sense the events are obviously not independent.

Random variables $X_1, X_2, X_3, \dots$ will be said to be independent random variables
if for all sets of the form $A_i = (x \text{ such that } a_i \le x \le b_i)$ the events $X_1 \in A_1$,
$X_2 \in A_2$, $X_3 \in A_3$, ... are independent events. This will mean that all values of the
$X_i$ are assumed independently of those of the remaining $X_i$.

Fig. 2.1. Illustration of statistical independence in pairs, but not in threes. In the three sets,
$A_i \cap A_j$ is, in all cases, the central region. By appropriate choice of probabilities, we can
arrange $P(A_i \cap A_j) = P(A_i)P(A_j)$.
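
The distinction between independence in pairs and full independence can be made concrete with a four-point sample space. This is a minimal sketch under our own assumptions (four equally likely outcomes, with outcome 0 playing the role of the common central region); it is not taken from the figure itself.

```python
from fractions import Fraction
from itertools import combinations

# Assumed sample space: four equally likely outcomes 0, 1, 2, 3.
prob = {w: Fraction(1, 4) for w in range(4)}

# Three events whose pairwise intersections all equal the "central region" {0}.
A = {1: {0, 1}, 2: {0, 2}, 3: {0, 3}}

def P(event):
    """Probability of a set of outcomes."""
    return sum(prob[w] for w in event)

# Factorisation holds for every pair of events ...
pairwise_ok = all(
    P(A[i] & A[j]) == P(A[i]) * P(A[j]) for i, j in combinations(A, 2)
)
# ... but fails for the triple intersection.
triple_ok = P(A[1] & A[2] & A[3]) == P(A[1]) * P(A[2]) * P(A[3])

print(pairwise_ok, triple_ok)
```

Here each event has probability 1/2 and each pairwise intersection has probability 1/4, so the pairs factorise, while the triple intersection has probability 1/4 rather than 1/8.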


2.4 Mean Values and Probability Density 


The mean value of a random variable $R(\omega)$ in which the basic events $\omega$ are countably
specifiable is given by

$$\langle R \rangle = \sum_{\omega} P(\omega) R(\omega), \tag{2.4.1}$$

where $P(\omega)$ means the probability of the set containing only the single event $\omega$. In
the case of a continuous variable, the probability axioms above enable us to define
a probability density function $p(\omega)$ such that if $A(\omega_0, d\omega_0)$ is the set

$$(\omega_0 \le \omega < \omega_0 + d\omega_0), \tag{2.4.2}$$

then

$$p(\omega_0)\,d\omega_0 = P[A(\omega_0, d\omega_0)] \tag{2.4.3}$$
$$\equiv p(\omega_0, d\omega_0). \tag{2.4.4}$$

The last is a notation often used by mathematicians. Details of how this is done
have been nicely explained by Feller [2.1]. In this case,

$$\langle R \rangle = \int_{\omega \in \Omega} d\omega\, R(\omega)\, p(\omega). \tag{2.4.5}$$

One can often (as mentioned in Sect. 2.2.4) use R itself to specify the event, so we will
often write

$$\langle R \rangle = \int dR\, R\, p(R). \tag{2.4.6}$$

Obviously, $p(R)$ is not the same function of R as $p(\omega)$ is of $\omega$; more precisely,

$$p(R_0)\,dR_0 = P[R_0 < R < R_0 + dR_0]. \tag{2.4.7}$$


2.4.1 Determination of Probability Density by Means of Arbitrary Functions

Suppose for every function $f(R)$ we know

$$\langle f(R) \rangle = \int dR\, f(R)\, p(R), \tag{2.4.8}$$

then we know $p(R)$. The proof follows by choosing

$$f(R) = \begin{cases} 1, & R_0 < R < R_0 + dR_0 \\ 0, & \text{otherwise,} \end{cases}$$

for which (2.4.8) reduces to $\langle f(R) \rangle = p(R_0)\,dR_0$, in agreement with (2.4.7).

Because the expectation of an arbitrary function is sometimes a little easier to work
with than a density, this relation will be used occasionally in this book.
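
As an illustration of this argument, the mean of an indicator function can be estimated from samples and compared with a known density. The sketch below assumes a standard Gaussian variable and a fixed random seed purely for reproducibility; the names `density_estimate` and `true_density` are our own.

```python
import math
import random

rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
samples = [rng.gauss(0.0, 1.0) for _ in range(200_000)]

def density_estimate(r0, dr):
    """Estimate p(r0) as <f(R)>/dr, where f is 1 on (r0, r0 + dr) and 0 otherwise."""
    hits = sum(1 for r in samples if r0 < r < r0 + dr)
    return hits / (len(samples) * dr)

def true_density(r):
    """Standard Gaussian density, used here only as the reference."""
    return math.exp(-r * r / 2.0) / math.sqrt(2.0 * math.pi)

estimate = density_estimate(0.5, 0.1)
print(estimate, true_density(0.55))
```

The estimate is compared with the density at the midpoint of the interval, since a finite interval width averages the density over that interval.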


2.4.2 Sets of Probability Zero 


If a density $p(R)$ exists, the probability that R is in the interval $(R_0, R_0 + dR)$ goes to
zero with dR. Hence, the probability that R has exactly the value $R_0$ is zero; and
similarly for any other value.

Thus, in such a case, there are sets $S(R_i)$, each containing only one point $R_i$,
which have zero probability. From probability axiom (iii), any countable union of
such sets, i.e., any set containing only a countable number of points (e.g., all rational
numbers) has probability zero. In general, all equalities in probability theory
are at best only "almost certainly true", i.e., they may be untrue on sets of probability
zero. Alternatively, one says, for example,


X = Y (with probability 1) (2.4.9) 
which is by no means the same as saying that 
X(R) = Y(R) for all R. (2.4.10) 


Of course, (if the theory is to have any connection with reality) events with proba- 
bility zero do not occur. 

In particular, notice that our previous result (Sect. 2.4.1), if inspected carefully, implies
only that we know $p(R)$ with probability 1, given that we know $\langle f(R) \rangle$ for all $f(R)$.


2.5 Mean Values 


The question of what to measure in a probabilistic system is nontrivial. In practice,
one measures either a set of individual values of a random variable (the number of
animals of a certain kind in a certain region at certain points in time; the electric
current passing through a given circuit element in each of a large number of replicas
of that circuit, etc.) or alternatively, the measuring procedure may implicitly construct
an average of some kind. For example, to measure an electric current, we may
measure the electric charge transferred and divide by the time taken; this gives a
measure of the average number of electrons transferred per unit time. It is important
to note the essential difference in this case, that it will not normally be possible
to measure anything other than a few selected averages and thus, higher
moments (for example) will be unavailable.

In contrast, when we measure individual events (as in counting animals), we can
then construct averages of the observables by the obvious method

$$\bar{X}_N = \frac{1}{N} \sum_{n=1}^{N} X(n). \tag{2.5.1}$$

The quantities X(n) are the individual observed values of the quantity X. We expect
that as the number of samples N becomes very large, the quantity $\bar{X}_N$ approaches
the mean $\langle X \rangle$ and that, in fact,

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f[X(n)] = \lim_{N \to \infty} \overline{f(X)}_N = \langle f(X) \rangle \tag{2.5.2}$$

and such a procedure will determine the probability density function p(x) of X if
we carry out this procedure for all functions f. The validity of this procedure
depends on the degree of independence of the successive measurements and is dealt
with in Sect. 2.5.2.

In the case where only averages themselves are directly determined by the measuring
method, it will not normally be possible to measure X(n) and therefore, it will
not, in general, be possible to determine $\overline{f(X)}_N$. All that will be available will be
$f(\bar{X}_N)$, quite a different thing unless f is linear. We can often find situations in
which measurable quantities are related (by means of some theory) to mean values 
of certain functions, but to hope to measure, for example, the mean value of an 
arbitrary function of the number of electrons in a conductor is quite hopeless. The 
mean number—yes, and indeed even the mean square number, but the measuring 
methods available are not direct. We do not enumerate the individual numbers of 
electrons at different times and hence arbitrary functions are not attainable. 
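
The difference between the average of a function and the function of the average is easy to see numerically. The following sketch uses four made-up sample values and the choice f(x) = x², both of which are assumptions for illustration only.

```python
# Hypothetical individual measurements X(1), ..., X(4).
samples = [1.0, 2.0, 3.0, 4.0]

def f(x):
    """A nonlinear function of the measured quantity (assumed: the square)."""
    return x * x

N = len(samples)
mean_x = sum(samples) / N                      # sample mean of X
mean_of_f = sum(f(x) for x in samples) / N     # average of f over individual values
f_of_mean = f(mean_x)                          # all that an averaging apparatus provides

print(mean_of_f, f_of_mean)  # 7.5 6.25
```

The two numbers differ by the sample variance of the data, which is exactly the information lost when only the average is measured.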


2.5.1 Moments, Correlations, and Covariances 


Quantities of interest are given by the moments $\langle X^n \rangle$ since these are often easily
calculated. However, probability densities must always vanish as $x \to \pm\infty$, so we
see that higher moments tell us only about the properties of unlikely large values of
X. In practice we find that the most important quantities are related to the first
and second moments. In particular, for a single variable X, the variance is defined by

$$\mathrm{var}\{X\} \equiv \{\sigma[X]\}^2 \equiv \langle [X - \langle X \rangle]^2 \rangle \tag{2.5.3}$$

and, as is well known, the variance var{X}, or its square root, the standard deviation
$\sigma[X]$, is a measure of the degree to which the values of X deviate from the mean
value $\langle X \rangle$.

In the case of several variables, we define the covariance matrix as

$$\langle X_i, X_j \rangle \equiv \langle (X_i - \langle X_i \rangle)(X_j - \langle X_j \rangle) \rangle \equiv \langle X_i X_j \rangle - \langle X_i \rangle \langle X_j \rangle. \tag{2.5.4}$$

Obviously,

$$\langle X_i, X_i \rangle = \mathrm{var}\{X_i\}. \tag{2.5.5}$$

If the variables are independent in pairs, the covariance matrix is diagonal.


2.5.2 The Law of Large Numbers 


As an application of the previous concepts, let us investigate the following model 
of measurement. We assume that we measure the same quantity N times, obtaining 
sample values of the random variable X(n) (n = 1, 2, ..., N). Since these are all
measurements of the same quantity at successive times, we assume that for every
n, X(n) has the same probability distribution but we do not assume the X(n) to be
independent. However, provided the covariance matrix $\langle X(n), X(m) \rangle$ vanishes sufficiently
rapidly as $|n - m| \to \infty$, then defining

$$\bar{X}_N = \frac{1}{N} \sum_{n=1}^{N} X(n), \tag{2.5.6}$$

we shall show

$$\lim_{N \to \infty} \bar{X}_N = \langle X \rangle. \tag{2.5.7}$$

It is clear that

$$\langle \bar{X}_N \rangle = \langle X \rangle. \tag{2.5.8}$$

We now calculate the variance of $\bar{X}_N$ and show that as $N \to \infty$ it vanishes under
certain conditions:

$$\langle \bar{X}_N \bar{X}_N \rangle - \langle \bar{X}_N \rangle^2 = \frac{1}{N^2} \sum_{n,m=1}^{N} \langle X_n, X_m \rangle. \tag{2.5.9}$$

Provided $\langle X_n, X_m \rangle$ falls off sufficiently rapidly as $|n - m| \to \infty$, we find

$$\lim_{N \to \infty} (\mathrm{var}\{\bar{X}_N\}) = 0 \tag{2.5.10}$$

so that $\lim_{N \to \infty} \bar{X}_N$ is a deterministic variable equal to $\langle X \rangle$.

Two models of $\langle X_n, X_m \rangle$ can be chosen.

a) $$\langle X_n, X_m \rangle \sim K \lambda^{|m-n|} \quad (\lambda < 1) \tag{2.5.11}$$

for which one finds

$$\mathrm{var}\{\bar{X}_N\} = \frac{2K}{N^2} \left[ \frac{\lambda^{N+1} - N(\lambda - 1) - \lambda}{(\lambda - 1)^2} \right] - \frac{K}{N} \to 0. \tag{2.5.12}$$

b) $$\langle X_n, X_m \rangle \sim |n - m|^{-1} \quad (n \ne m) \tag{2.5.13}$$

and one finds approximately

$$\mathrm{var}\{\bar{X}_N\} \sim \frac{2}{N} \log N - \frac{1}{N} \to 0. \tag{2.5.14}$$

In both these cases, $\mathrm{var}\{\bar{X}_N\} \to 0$. The rate of convergence is very different.
Interpreting n, m as the times at which the measurement is carried out, one sees that
even very slowly decaying correlations are permissible. The law of large numbers
comes in many forms, which are nicely summarised by Papoulis [2.2]. The central
limit theorem is an even more precise result in which the limiting distribution
function of $\bar{X}_N - \langle X \rangle$ is determined (see Sect. 2.8.2).
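
For the geometrically decaying model (a), the double sum in (2.5.9) can be evaluated directly and compared with a closed form. The sketch below is our own check, not part of the original derivation: the closed form in `var_mean_closed` is what one obtains by summing the geometric series, and the parameter values K = 1, λ = 0.5 are arbitrary.

```python
def var_mean_direct(N, K, lam):
    """Variance of the sample mean from the double sum (1/N^2) sum_{n,m} K lam^|n-m|."""
    total = sum(K * lam ** abs(n - m) for n in range(N) for m in range(N))
    return total / N**2

def var_mean_closed(N, K, lam):
    """Closed form of the same double sum, obtained by summing the geometric series."""
    bracket = (lam ** (N + 1) - N * (lam - 1.0) - lam) / (lam - 1.0) ** 2
    return 2.0 * K * bracket / N**2 - K / N

for N in (1, 5, 50, 200):
    print(N, var_mean_direct(N, 1.0, 0.5), var_mean_closed(N, 1.0, 0.5))
```

The printed values decrease roughly like 1/N for large N, which is the behaviour the law of large numbers requires.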


2.6 Characteristic Function 


One would like a condition where the variables are independent, not just in pairs. 
To this end (and others) we define the characteristic function. 

If s is the vector $(s_1, s_2, \dots, s_n)$, and X the vector of random variables $(X_1, X_2,
\dots, X_n)$, then the characteristic function (or moment generating function) is defined
by

$$\phi(s) = \langle \exp(i s \cdot X) \rangle = \int dx\, p(x) \exp(i s \cdot x). \tag{2.6.1}$$

The characteristic function has the following properties [Ref. 2.1, Chap. XV]:

i) $\phi(0) = 1$

ii) $|\phi(s)| \le 1$

iii) $\phi(s)$ is a uniformly continuous function of its arguments for all finite real s [2.5].

iv) If the moments $\langle \prod_i X_i^{m_i} \rangle$ exist, then

$$\Big\langle \prod_i X_i^{m_i} \Big\rangle = \Big[ \prod_i \Big( -i \frac{\partial}{\partial s_i} \Big)^{m_i} \phi(s) \Big]_{s=0}. \tag{2.6.2}$$

v) A sequence of probability densities converges to a limiting probability density if
and only if the corresponding characteristic functions converge to the corresponding
characteristic function of the limiting probability density.

vi) Fourier inversion formula

$$p(x) = (2\pi)^{-n} \int ds\, \phi(s) \exp(-i x \cdot s). \tag{2.6.3}$$

Because of this inversion formula, $\phi(s)$ determines $p(x)$ with probability 1. Hence,
the characteristic function does truly characterise the probability density.

vii) Independent random variables: from the definition of independent random
variables in Sect. 2.3.4, it follows that the variables $X_1, X_2, \dots$ are independent if
and only if

$$p(x_1, x_2, \dots, x_n) = p_1(x_1)\, p_2(x_2) \dots p_n(x_n), \tag{2.6.4}$$

in which case,

$$\phi(s_1, s_2, \dots, s_n) = \phi_1(s_1)\, \phi_2(s_2) \dots \phi_n(s_n). \tag{2.6.5}$$

viii) Sum of independent random variables: if $X_1, X_2, \dots$ are independent random
variables and if

$$Y = \sum_{i=1}^{n} X_i \tag{2.6.6}$$

and the characteristic function of Y is

$$\phi_y(s) = \langle \exp(i s Y) \rangle, \tag{2.6.7}$$

then

$$\phi_y(s) = \prod_{i=1}^{n} \phi_i(s). \tag{2.6.8}$$

The characteristic function plays an important role in this book which arises from
the convergence property (v), which allows us to perform limiting processes on the
characteristic function rather than the probability distribution itself, and often makes
proofs easier. Further, the fact that the characteristic function is truly characteristic,
i.e., the inversion formula (vi), shows that different characteristic functions arise
from different distributions. As well as this, the straightforward derivation of the
moments by (2.6.2) makes any determination of the characteristic function directly
relevant to measurable quantities.
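
Property (viii) can be checked for a simple discrete case. The sketch below assumes two independent fair dice (our own choice of example): the characteristic function of their sum, computed from the convolved distribution, should equal the square of the single-die characteristic function.

```python
import cmath
from itertools import product

# Assumed example: a fair six-sided die.
die = {k: 1.0 / 6.0 for k in range(1, 7)}

def char_fn(dist, s):
    """Characteristic function <exp(i s X)> of a discrete distribution {value: prob}."""
    return sum(p * cmath.exp(1j * s * x) for x, p in dist.items())

def sum_distribution(d1, d2):
    """Distribution of X1 + X2 for independent X1, X2 (discrete convolution)."""
    out = {}
    for (x1, p1), (x2, p2) in product(d1.items(), d2.items()):
        out[x1 + x2] = out.get(x1 + x2, 0.0) + p1 * p2
    return out

two_dice = sum_distribution(die, die)

for s in (0.0, 0.3, 1.7):
    print(s, char_fn(two_dice, s), char_fn(die, s) ** 2)
```

The same check also illustrates properties (i) and (ii): the value at s = 0 is 1 and the modulus never exceeds 1.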


2.7 Cumulant Generating Function: Correlation Functions and 
Cumulants 


A further important property of the characteristic function arises by considering its
logarithm

$$\Phi(s) = \log \phi(s) \tag{2.7.1}$$

which is called the cumulant generating function. Let us assume that all moments
exist so that $\phi(s)$, and hence $\Phi(s)$, is expandable in a power series which can be
written as

$$\Phi(s) = \sum_{r=1}^{\infty} i^r \sum_{\{m\}} \langle\langle X_1^{m_1} X_2^{m_2} \dots X_n^{m_n} \rangle\rangle \frac{s_1^{m_1} s_2^{m_2} \dots s_n^{m_n}}{m_1!\, m_2! \dots m_n!}\, \delta\Big(r, \sum_{i=1}^{n} m_i\Big), \tag{2.7.2}$$

where the quantities $\langle\langle X_1^{m_1} X_2^{m_2} \dots X_n^{m_n} \rangle\rangle$ are called the cumulants of the variables
X. The notation chosen should not be taken to mean that the cumulants are functions
of the particular product of powers of the X; it rather indicates the moment
of highest order which occurs in their expression in terms of moments. Stratonovich
[2.4] also uses the term correlation functions, a term which we shall reserve for
cumulants which involve more than one $X_i$. For, if the X are all independent, the
factorisation property (2.6.5) implies that $\Phi(s)$ (the cumulant generating function)
is a sum of n terms, each of which is a function of only one $s_i$, and hence the coefficients
of mixed terms, i.e., the correlation functions (in our terminology), are all zero
and the converse is also true. Thus, the magnitude of the correlation functions
is a measure of the degree of correlation.

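The cumulants and correlation functions can be evaluated in terms of moments
by expanding the characteristic function as a power series:

$$\phi(s) = \sum_{r=0}^{\infty} \frac{i^r}{r!} \sum_{\{m\}} \langle X_1^{m_1} X_2^{m_2} \dots X_n^{m_n} \rangle \frac{r!}{m_1!\, m_2! \dots m_n!}\, \delta\Big(r, \sum_{i=1}^{n} m_i\Big)\, s_1^{m_1} s_2^{m_2} \dots s_n^{m_n}, \tag{2.7.3}$$

expanding the logarithm in a power series, and comparing it with (2.7.2) for $\Phi(s)$.
No simple formula can be given, but the first few cumulants can be exhibited: we
find

$$\langle\langle X_i \rangle\rangle = \langle X_i \rangle \tag{2.7.4}$$
$$\langle\langle X_i X_j \rangle\rangle = \langle X_i X_j \rangle - \langle X_i \rangle \langle X_j \rangle \tag{2.7.5}$$
$$\langle\langle X_i X_j X_k \rangle\rangle = \langle X_i X_j X_k \rangle - \langle X_i X_j \rangle \langle X_k \rangle - \langle X_i \rangle \langle X_j X_k \rangle - \langle X_i X_k \rangle \langle X_j \rangle + 2 \langle X_i \rangle \langle X_j \rangle \langle X_k \rangle \tag{2.7.6}$$

(here, all formulae are valid for any number of equal i, j, k, l). An explicit general
formula can be given as follows. Suppose we wish to calculate the cumulant
$\langle\langle X_1 X_2 X_3 \dots X_n \rangle\rangle$. The procedure is the following:

i) write a sequence of n dots ...... ;

ii) divide into p + 1 subsets by inserting angle brackets

$$\langle \dots \rangle \langle \dots \rangle \langle \dots\dots \rangle \dots \langle \dots \rangle ;$$

iii) distribute the symbols $X_1 \dots X_n$ in place of the dots in such a way that all
different expressions of this kind occur, e.g.,

$$\langle X_1 \rangle \langle X_2 X_3 \rangle = \langle X_1 \rangle \langle X_3 X_2 \rangle \ne \langle X_3 \rangle \langle X_1 X_2 \rangle ;$$

iv) take the sum of all such terms for a given p. Call this $C_p(X_1, X_2, \dots, X_n)$;

v) $$\langle\langle X_1 X_2 \dots X_n \rangle\rangle = \sum_{p=0}^{n-1} (-1)^p\, p!\, C_p(X_1, X_2, \dots, X_n). \tag{2.7.7}$$

A derivation of this formula was given by Meeron [2.6]. The particular procedure
is due to van Kampen [2.7].

vi) Cumulants in which there is one or more repeated element, e.g., $\langle\langle X_1^2 X_3 X_2 \rangle\rangle$:
simply evaluate $\langle\langle X_1 X_2 X_3 X_4 \rangle\rangle$ and set $X_4 = X_1$ in the resulting expression.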
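

2.7.1 Example: Cumulant of Order 4: $\langle\langle X_1 X_2 X_3 X_4 \rangle\rangle$


a) p = 0

The only term is $\langle X_1 X_2 X_3 X_4 \rangle = C_0(X_1 X_2 X_3 X_4)$.

b) p = 1

partition $\langle . \rangle \langle ... \rangle$

Term $\{ \langle X_1 \rangle \langle X_2 X_3 X_4 \rangle + \langle X_2 \rangle \langle X_3 X_4 X_1 \rangle + \langle X_3 \rangle \langle X_4 X_1 X_2 \rangle + \langle X_4 \rangle \langle X_1 X_2 X_3 \rangle \} \equiv D_1$

partition $\langle .. \rangle \langle .. \rangle$

Term $\langle X_1 X_2 \rangle \langle X_3 X_4 \rangle + \langle X_1 X_3 \rangle \langle X_2 X_4 \rangle + \langle X_1 X_4 \rangle \langle X_2 X_3 \rangle \equiv D_2$.

Hence,

$$D_1 + D_2 = C_1(X_1 X_2 X_3 X_4).$$

c) p = 2

partition $\langle . \rangle \langle . \rangle \langle .. \rangle$

Term $\langle X_1 \rangle \langle X_2 \rangle \langle X_3 X_4 \rangle + \langle X_1 \rangle \langle X_3 \rangle \langle X_2 X_4 \rangle + \langle X_1 \rangle \langle X_4 \rangle \langle X_2 X_3 \rangle + \langle X_2 \rangle \langle X_3 \rangle \langle X_1 X_4 \rangle + \langle X_2 \rangle \langle X_4 \rangle \langle X_1 X_3 \rangle + \langle X_3 \rangle \langle X_4 \rangle \langle X_1 X_2 \rangle = C_2(X_1 X_2 X_3 X_4)$.

d) p = 3

partition $\langle . \rangle \langle . \rangle \langle . \rangle \langle . \rangle$

Term $\langle X_1 \rangle \langle X_2 \rangle \langle X_3 \rangle \langle X_4 \rangle = C_3(X_1 X_2 X_3 X_4)$.

Hence,

$$\langle\langle X_1 X_2 X_3 X_4 \rangle\rangle = C_0 - C_1 + 2 C_2 - 6 C_3. \tag{2.7.8}$$
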
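
The partition procedure of Sect. 2.7 lends itself to direct enumeration. The following sketch is our own illustration rather than the procedure as originally coded anywhere: it sums over all partitions of the index list into p + 1 blocks with weight $(-1)^p p!$, and tests the result on a hypothetical Bernoulli variable with success probability q = 1/3, for which every moment $\langle X^k \rangle$ equals q.

```python
from fractions import Fraction
from math import factorial

def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for k in range(len(partition)):
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]
        # ... or give it a block of its own.
        yield [[first]] + partition

def cumulant(indices, moment):
    """Cumulant <<X_i X_j ...>> from moments; `moment(block)` returns <product of X over block>."""
    total = 0
    for partition in set_partitions(list(indices)):
        p = len(partition) - 1          # p + 1 blocks, as in step (ii)
        term = 1
        for block in partition:
            term *= moment(tuple(block))
        total += (-1) ** p * factorial(p) * term
    return total

# Assumed test case: Bernoulli variable, so every moment of X equals q.
q = Fraction(1, 3)
bernoulli_moment = lambda block: q

kappa2 = cumulant((1, 1), bernoulli_moment)         # variance q(1 - q)
kappa4 = cumulant((1, 1, 1, 1), bernoulli_moment)   # fourth cumulant
print(kappa2, kappa4)
```

Repeated indices are handled as in rule (vi): the positions are treated as distinct when partitioning, and equal labels simply give equal moments. For four indices the enumeration produces 1 + 7 + 6 + 1 = 15 partitions, matching the term counts in the example above.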
2.7.2 Significance of Cumulants 


From (2.7.4, 5) we see that the first two cumulants are the means $\langle X_i \rangle$ and covariances
$\langle X_i, X_j \rangle$. Higher-order cumulants contain information of decreasing
significance, unlike higher-order moments. We cannot set all moments higher than
a certain order equal to zero since $\langle X^{2n} \rangle \ge \langle X^n \rangle^2$ and thus, all moments contain
information about lower moments.

For cumulants, however, we can consistently set

$$\langle\langle X \rangle\rangle = a$$
$$\langle\langle X^2 \rangle\rangle = \sigma^2$$
$$\langle\langle X^n \rangle\rangle = 0 \quad (n > 2),$$

and we can easily deduce by using the inversion formula for the characteristic function
that

$$p(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp[-(x - a)^2 / 2\sigma^2], \tag{2.7.9}$$

a Gaussian probability distribution. It does not, however, seem possible to give
more than this intuitive justification. Indeed, the theorem of Marcinkiewicz [2.8, 9]
shows that the cumulant generating function cannot be a polynomial of degree
greater than 2, that is, either all but the first 2 cumulants vanish or there are an
infinite number of nonvanishing cumulants. The greatest significance of cumulants
lies in the definition of the correlation functions of different variables in terms of
them; this leads further to important approximation methods.


2.8 Gaussian and Poissonian Probability Distributions 


2.8.1 The Gaussian Distribution 


By far the most important probability distribution is the Gaussian, or normal 
distribution. Here we collect together the most important facts about it. 

If X is a vector of n Gaussian random variables, the corresponding multivariate
probability density function can be written

$$p(x) = [(2\pi)^n \det(\sigma)]^{-1/2} \exp[-\tfrac{1}{2} (x - \bar{x})^T \sigma^{-1} (x - \bar{x})] \tag{2.8.1}$$

so that

$$\langle X \rangle = \int dx\, x\, p(x) = \bar{x} \tag{2.8.2}$$
$$\langle X X^T \rangle = \int dx\, x x^T p(x) = \bar{x} \bar{x}^T + \sigma \tag{2.8.3}$$

and the characteristic function is given by

$$\phi(s) = \langle \exp(i s^T X) \rangle = \exp(i s^T \bar{x} - \tfrac{1}{2} s^T \sigma s). \tag{2.8.4}$$

This particularly simple characteristic function implies that all cumulants of higher
order than 2 vanish, and hence means that all moments of order higher than 2 are
expressible in terms of those of order 1 and 2. The relationship (2.8.3) means that $\sigma$
is the covariance matrix (as defined in Sect. 2.5.1), i.e., the matrix whose elements
are the second-order correlation functions. Of course, $\sigma$ is symmetric.

The precise relationship between the higher moments and the covariance matrix
$\sigma$ can be written down straightforwardly by using the relationship between the
moments and the characteristic function [Sect. 2.6 (iv)]. The formula is only simple
if $\bar{x} = 0$, in which case the odd moments vanish and the even moments satisfy

$$\langle X_i X_j X_k \dots \rangle = \frac{(2N)!}{N!\, 2^N} \{ \sigma_{ij} \sigma_{kl} \sigma_{mn} \dots \}_{\text{sym}} \tag{2.8.4}$$

where the subscript "sym" means the symmetrised form of the product of $\sigma$'s, and
2N is the order of the moment. For example,

$$\langle X_1 X_2 X_3 X_4 \rangle = \frac{4!}{4 \cdot 2!} \Big\{ \tfrac{1}{3} [\sigma_{12} \sigma_{34} + \sigma_{41} \sigma_{23} + \sigma_{13} \sigma_{24}] \Big\} = \sigma_{12} \sigma_{34} + \sigma_{41} \sigma_{23} + \sigma_{13} \sigma_{24} \tag{2.8.5}$$
$$\langle X_1^4 \rangle = \frac{4!}{4 \cdot 2!} \{ \sigma_{11}^2 \} = 3 \sigma_{11}^2. \tag{2.8.6}$$
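
The symmetrised product in the moment formula amounts to a sum over all ways of pairing the indices, of which there are $(2N)!/(N!\,2^N)$. The sketch below enumerates these pairings for a zero-mean Gaussian; the 2 x 2 covariance matrix is a made-up example and the function names are our own.

```python
def pairings(items):
    """Yield every way of splitting the list `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def gaussian_moment(indices, sigma):
    """<X_i X_j ...> for a zero-mean Gaussian with covariance matrix `sigma`."""
    indices = list(indices)
    if len(indices) % 2:
        return 0.0                      # odd moments vanish when the mean is zero
    total = 0.0
    for pairing in pairings(indices):
        term = 1.0
        for i, j in pairing:
            term *= sigma[i][j]
        total += term
    return total

# Hypothetical covariance matrix (symmetric, positive definite).
sigma = [[2.0, 0.5],
         [0.5, 1.0]]

print(gaussian_moment((0, 0, 0, 0), sigma))  # 3 * sigma_00^2 = 12.0
print(gaussian_moment((0, 0, 1, 1), sigma))  # sigma_00 sigma_11 + 2 sigma_01^2 = 2.5
```

For four indices there are three pairings, which reproduces the three terms of the fourth-order example and the factor 3 in the fourth moment of a single variable.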


2.8.2 Central Limit Theorem 


The Gaussian distribution is important for a variety of reasons. Many variables are,
in practice, empirically well approximated by Gaussians and the reason for this
arises from the central limit theorem, which, roughly speaking, asserts that a random
variable composed of the sum of many parts, each independent but arbitrarily
distributed, is Gaussian. More precisely, let $X_1, X_2, X_3, \dots, X_n$ be independent random
variables such that

$$\langle X_i \rangle = 0, \quad \mathrm{var}\{X_i\} = b_i^2 \tag{2.8.7}$$

and let the distribution function of $X_i$ be $p_i(x_i)$.
Define

$$S_n = \sum_{i=1}^{n} X_i, \tag{2.8.8}$$

and

$$\sigma_n^2 = \mathrm{var}\{S_n\} = \sum_{i=1}^{n} b_i^2. \tag{2.8.9}$$

We require further the fulfilment of the Lindeberg condition:

$$\lim_{n \to \infty} \Big[ \frac{1}{\sigma_n^2} \sum_{i=1}^{n} \int_{|x| > t \sigma_n} dx\, x^2 p_i(x) \Big] = 0 \tag{2.8.10}$$

for any fixed t > 0. Then, under these conditions, the distribution of the normalised
sums $S_n / \sigma_n$ tends to the Gaussian with zero mean and unit variance.

The proof of the theorem can be found in [2.1]. It is worthwhile commenting on
the hypotheses, however. We first note that the summands $X_i$ are required to be
independent. This condition is not absolutely necessary; for example, choose

$$X_i = \sum_{r=i}^{i+j} Y_r \tag{2.8.11}$$

where the $Y_j$ are independent. Since the sum of the X's can be rewritten as a sum of
Y's (with certain finite coefficients), the theorem is still true.

Roughly speaking, as long as the correlation between $X_i$ and $X_j$ goes to zero
sufficiently rapidly as $|i - j| \to \infty$, a central limit theorem will be expected. The Lindeberg
condition (2.8.10) is not an obviously understandable condition but is the
weakest condition which expresses the requirement that the probability for $|X_i|$
to be large is very small. For example, if all the $b_i$ are infinite or greater than some
constant C, it is clear that $\sigma_n^2$ diverges as $n \to \infty$. The sum of integrals in (2.8.10)
is the sum of contributions to variances for all $|X_i| > t \sigma_n$, and it is clear that as
$n \to \infty$, each contribution goes to zero. The Lindeberg condition requires the sum of
all the contributions not to diverge as fast as $\sigma_n^2$. In practice, it is a rather weak
requirement; satisfied if $|X_i| < C$ for all $X_i$, or if the $p_i(x)$ go to zero sufficiently rapidly
as $x \to \pm\infty$. An exception is

$$p_i(x) = \frac{a_i}{\pi (x^2 + a_i^2)}, \tag{2.8.12}$$

the Cauchy, or Lorentzian distribution. The variance of this distribution is infinite
and, in fact, the sum of all the $X_i$ has a distribution of the same form as (2.8.12)
with $a_i$ replaced by $\sum_{i=1}^{n} a_i$. Obviously, the Lindeberg condition is not satisfied.

A related condition, also known as the Lindeberg condition, will arise in Sect.
3.3.1, where we discuss the replacement of a discrete process by one with continuous
steps.
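
The approach to the Gaussian can be followed through characteristic functions, since for independent summands the characteristic function of the sum is a product. The sketch below assumes identically distributed summands that are uniform on $[-\sqrt{3}, \sqrt{3}]$ (zero mean, unit variance); this choice, and the function names, are ours and serve only as an illustration.

```python
import math

def phi_uniform(s):
    """Characteristic function of a uniform variable on [-sqrt(3), sqrt(3)]."""
    a = math.sqrt(3.0) * s
    return 1.0 if a == 0.0 else math.sin(a) / a

def phi_normalised_sum(s, n):
    """Characteristic function of S_n / sigma_n for n independent uniform summands."""
    # sigma_n = sqrt(n) here, and the characteristic function of a sum factorises.
    return phi_uniform(s / math.sqrt(n)) ** n

def phi_gauss(s):
    """Characteristic function of the Gaussian with zero mean and unit variance."""
    return math.exp(-0.5 * s * s)

for n in (1, 10, 1000):
    print(n, abs(phi_normalised_sum(1.0, n) - phi_gauss(1.0)))
```

The printed differences shrink roughly like 1/n, consistent with convergence of the characteristic functions and hence of the distributions.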


2.8.3 The Poisson Distribution 




A distribution which plays a central role in the study of random variables which
take on positive integer values is the Poisson distribution. If X is the relevant
variable, the Poisson distribution is defined by

$$P(X = x) \equiv P(x) = e^{-\alpha} \alpha^x / x! \tag{2.8.13}$$

and clearly, the factorial moments, defined by

$$\langle X^r \rangle_f = \langle x(x - 1) \dots (x - r + 1) \rangle, \tag{2.8.14}$$

are given by

$$\langle X^r \rangle_f = \alpha^r. \tag{2.8.15}$$

For variables whose range is nonnegative integral, we can very naturally define the
generating function

$$G(s) = \sum_{x=0}^{\infty} s^x P(x) = \langle s^X \rangle \tag{2.8.16}$$

which is related to the characteristic function by

$$G(s) = \phi(-i \log s). \tag{2.8.17}$$

The generating function has the useful property that

$$\langle X^r \rangle_f = \Big[ \Big( \frac{\partial}{\partial s} \Big)^r G(s) \Big]_{s=1}. \tag{2.8.18}$$

For the Poisson distribution we have

$$G(s) = \sum_{x=0}^{\infty} \frac{e^{-\alpha} (s\alpha)^x}{x!} = \exp[\alpha(s - 1)]. \tag{2.8.19}$$

We may also define the factorial cumulant generating function g(s) by

$$g(s) = \log G(s) \tag{2.8.20}$$

and the factorial cumulants $\langle\langle X^r \rangle\rangle_f$ by

$$g(s) = \sum_{r=1}^{\infty} \langle\langle X^r \rangle\rangle_f \frac{(s - 1)^r}{r!}.$$

We see that the Poisson distribution has all but the first factorial cumulant zero.

The Poisson distribution arises naturally in very many contexts, for example,
we have already met it in Sect. 1.2.3 as the solution of a simple master equation.
It plays a similar central role in the study of random variables which take on integer
values to that occupied by the Gaussian distribution in the study of variables with
a continuous range. However, the only simple multivariate generalisation of the
Poisson is simply a product of Poissons, i.e., of the form

$$P(x_1, x_2, x_3, \dots) = \prod_{i=1}^{n} \frac{e^{-\alpha_i} (\alpha_i)^{x_i}}{x_i!}. \tag{2.8.21}$$


There is no logical concept of a correlated multipoissonian distribution, similar to 
that of a correlated multivariate Gaussian distribution. 
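
The factorial moments and the generating function of the Poisson distribution can be checked by direct summation. In the sketch below the parameter value 2.5 and the truncation of the sums at x = 100 are our own assumptions; the neglected tail is negligible for this parameter.

```python
import math

def poisson_pmf(x, alpha):
    """Poisson probability exp(-alpha) alpha^x / x!."""
    return math.exp(-alpha) * alpha**x / math.factorial(x)

def factorial_moment(r, alpha, x_max=100):
    """<x (x - 1) ... (x - r + 1)>, summed over x = 0, ..., x_max."""
    total = 0.0
    for x in range(x_max + 1):
        falling = 1
        for k in range(r):
            falling *= (x - k)
        total += falling * poisson_pmf(x, alpha)
    return total

def generating_function(s, alpha, x_max=100):
    """G(s) = sum over x of s^x P(x), truncated at x_max."""
    return sum(s**x * poisson_pmf(x, alpha) for x in range(x_max + 1))

alpha = 2.5
print(factorial_moment(2, alpha), alpha**2)
print(generating_function(0.7, alpha), math.exp(alpha * (0.7 - 1.0)))
```

Each pair of printed numbers should agree to within rounding error: the r-th factorial moment equals the r-th power of the parameter, and the truncated sum reproduces the closed-form generating function.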


2.9 Limits of Sequences of Random Variables 


Much of computational work consists of determining approximations to random
variables, in which the concept of a limit of a sequence of random variables naturally
arises. However, there is no unique way of defining such a limit.

For, suppose we have a probability space $\Omega$, and a sequence of random variables
$X_n$ defined on $\Omega$. Then by the limit of the sequence as $n \to \infty$

$$X = \lim_{n \to \infty} X_n, \tag{2.9.1}$$

we mean a random variable X which, in some sense, is approached by the sequence
of random variables $X_n$. The various possibilities arise when one considers that the
probability space $\Omega$ has elements $\omega$ which have a probability density $p(\omega)$. Then we
can choose the following definitions.


2.9.1 Almost Certain Limit 


$X_n$ converges almost certainly to X if, for all $\omega$ except a set of probability zero,

$$\lim_{n \to \infty} X_n(\omega) = X(\omega). \tag{2.9.2}$$

Thus each realisation of $X_n$ converges to X and we write

$$\underset{n \to \infty}{\text{ac-lim}}\; X_n = X. \tag{2.9.3}$$


2.9.2 Mean Square Limit (Limit in the Mean) 


Another possibility is to regard the $X_n(\omega)$ as functions of $\omega$, and look for the
mean square deviation of $X_n(\omega)$ from $X(\omega)$. Thus, we say that $X_n$ converges to X in
the mean square if

$$\lim_{n \to \infty} \int d\omega\, p(\omega) [X_n(\omega) - X(\omega)]^2 \equiv \lim_{n \to \infty} \langle (X_n - X)^2 \rangle = 0. \tag{2.9.4}$$

This is the kind of limit which is well known in Hilbert space theory. We write

$$\underset{n \to \infty}{\text{ms-lim}}\; X_n = X. \tag{2.9.5}$$


2.9.3 Stochastic Limit, or Limit in Probability 


We can consider the possibility that $X_n(\omega)$ approaches X because the probability of
deviation from X approaches zero: precisely, this means that if for any $\varepsilon > 0$

$$\lim_{n \to \infty} P(|X_n - X| > \varepsilon) = 0 \tag{2.9.6}$$

then the stochastic limit of $X_n$ is X.

Note that the probability can be written as follows. Suppose $\chi_\varepsilon(z)$ is a function
such that

$$\chi_\varepsilon(z) = \begin{cases} 1, & |z| > \varepsilon \\ 0, & |z| \le \varepsilon. \end{cases} \tag{2.9.7}$$

Then

$$P(|X_n - X| > \varepsilon) = \int d\omega\, p(\omega)\, \chi_\varepsilon(|X_n - X|). \tag{2.9.8}$$

In this case, we write

$$\underset{n \to \infty}{\text{st-lim}}\; X_n = X. \tag{2.9.9}$$


2.9.4 Limit in Distribution 


An even weaker form of convergence occurs if, for any continuous bounded
function f(x),

$$\lim_{n \to \infty} \langle f(X_n) \rangle = \langle f(X) \rangle. \tag{2.9.10}$$

In this case the convergence of the limit is said to be in distribution. In particular,
using exp(ixs) for f(x), we find that the characteristic functions approach each
other, and hence the probability density of $X_n$ approaches that of X.


2.9.5 Relationship Between Limits 


The following relations can be shown. 


Almost certain convergence $\Longrightarrow$ stochastic convergence.
Convergence in mean square $\Longrightarrow$ stochastic convergence.
Stochastic convergence $\Longrightarrow$ convergence in distribution.


All of these limits have uses in applications. 
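
As a concrete illustration of the second relation, consider the average of N fair coin flips, which converges to 1/2 both in mean square and stochastically. The sketch below is our own example: it sums exactly over the binomial law, and the tolerance 0.1 is an arbitrary choice. The mean square deviation bounds the deviation probability through the Chebyshev inequality, which is how mean square convergence implies stochastic convergence.

```python
from math import comb

def mean_square_deviation(N):
    """<(Xbar_N - 1/2)^2> for the average of N fair coin flips (exact binomial sum)."""
    return sum(comb(N, k) * 0.5**N * (k / N - 0.5) ** 2 for k in range(N + 1))

def prob_deviation_exceeds(N, eps):
    """P(|Xbar_N - 1/2| > eps) for the average of N fair coin flips."""
    return sum(comb(N, k) * 0.5**N for k in range(N + 1) if abs(k / N - 0.5) > eps)

for N in (25, 100, 400):
    msd = mean_square_deviation(N)
    # The last column is the Chebyshev bound msd / eps^2 on the deviation probability.
    print(N, msd, prob_deviation_exceeds(N, 0.1), msd / 0.1**2)
```

The mean square deviation equals 1/(4N), so it vanishes as N grows, and the deviation probability stays below the Chebyshev bound in every row.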


3. Markov Processes 


3.1 Stochastic Processes 


All of the examples given in Chap. | can be mathematically described as stochastic 
processes by which we mean, in a loose sense, systems which evolve probabilistically 
in time or more precisely, systems in which a certain time-dependent random 
variable X(t) exists. We can measure values x,, X2, X3, ..., etc., of X(t) at times 4, 
tn, t3, ... and we assume that a set of joint probability densities exists 


D Xig tis Boy ts Rats) (3.1.1) 


which describe the system completely. 
In terms of these joint probability density functions, one can also define condi- 
tional probability densities: 


PUX1, t1,5 X25 t25 --- [Vis T13 V2» T23 oe 


= P(X, 13 Xa, bes 5 Vis Ta V2s T23 --)/PCV1, T13 Vas T23 +++) (3.1.2) 


These definitions are valid independently of the ordering of the times, although it is 
usual to consider only times which increase from right to left i.e., 


t2nH2HD2..2AMA7ND.... (3.1.3) 


The concept of an evolution equation leads us to consider the conditional probabill- 
ties as predictions of the future values of X(t) (i.e., x,, x2, ... at times ¢,, t,, ...), given 
the knowledge of the past (values y,, y2, ..., at times 7,72, ...). 

The concept of a general stochastic process is very loose. To define the process 
we need to know at least all possible joint probabilities of the kind in (3.1.1). If such 
knowledge does define the process, it is known as a separable stochastic process. 
All the processes considered in this book will be assumed to be separable. 

The simplest kind of stochastic process is that of complete independence:


$$ p(x_1, t_1; x_2, t_2; x_3, t_3; \ldots) = \prod_i p(x_i, t_i) , \tag{3.1.4} $$


which means that the value of $X$ at time $t$ is completely independent of its values in
the past (or future). An even more special case occurs when the $p(x_i, t_i)$ are independent
of $t_i$, so that the same probability law governs the process at all times. We
then have the Bernoulli trials, in which a probabilistic process is repeated at successive
times.




The next simplest idea is that of the Markov process, in which knowledge of
only the present determines the future.


3.2 Markov Process 


The Markov assumption is formulated in terms of the conditional probabilities. We
require that if the times satisfy the ordering (3.1.3), the conditional probability is
determined entirely by the knowledge of the most recent condition, i.e.,


$$ p(x_1, t_1; x_2, t_2; \ldots \mid y_1, \tau_1; y_2, \tau_2; \ldots) = p(x_1, t_1; x_2, t_2; \ldots \mid y_1, \tau_1) . \tag{3.2.1} $$


This is simply a more precise statement of the assumptions made by Einstein,
Smoluchowski and others. It is, even by itself, extremely powerful. For it means
that we can define everything in terms of the simple conditional probabilities
$p(x_1, t_1 \mid y_1, \tau_1)$. For example, by definition of the conditional probability density
$p(x_1, t_1; x_2, t_2 \mid y_1, \tau_1) = p(x_1, t_1 \mid x_2, t_2; y_1, \tau_1)\, p(x_2, t_2 \mid y_1, \tau_1)$ and using the Markov
assumption (3.2.1), we find

$$ p(x_1, t_1; x_2, t_2 \mid y_1, \tau_1) = p(x_1, t_1 \mid x_2, t_2)\, p(x_2, t_2 \mid y_1, \tau_1) \tag{3.2.2} $$

and it is not difficult to see that an arbitrary joint probability can be expressed simply
as

$$
\begin{aligned}
p(x_1, t_1; x_2, t_2; x_3, t_3; \ldots; x_n, t_n)
 = {} & p(x_1, t_1 \mid x_2, t_2)\, p(x_2, t_2 \mid x_3, t_3)\, p(x_3, t_3 \mid x_4, t_4) \ldots \\
 & \ldots p(x_{n-1}, t_{n-1} \mid x_n, t_n)\, p(x_n, t_n)
\end{aligned}
\tag{3.2.3}
$$


provided


$$ t_1 \ge t_2 \ge t_3 \ge \ldots \ge t_{n-1} \ge t_n . \tag{3.2.4} $$


3.2.1 Consistency: the Chapman-Kolmogorov Equation


From Sect. 2.3.3 we require that summing over all mutually exclusive events of
one kind in a joint probability eliminates that variable, i.e.,


$$ \sum_B P(A \cap B \cap C \ldots) = P(A \cap C \ldots) ; \tag{3.2.5} $$


and when this is applied to stochastic processes, we get two deceptively similar
equations:


$$
\begin{aligned}
p(x_1, t_1) &= \int dx_2\, p(x_1, t_1; x_2, t_2) \\
 &= \int dx_2\, p(x_1, t_1 \mid x_2, t_2)\, p(x_2, t_2) .
\end{aligned}
\tag{3.2.6}
$$

This equation is an identity valid for all stochastic processes and is the first in a
hierarchy of equations, the second of which is


$$
\begin{aligned}
p(x_1, t_1 \mid x_3, t_3) &= \int dx_2\, p(x_1, t_1; x_2, t_2 \mid x_3, t_3) \\
 &= \int dx_2\, p(x_1, t_1 \mid x_2, t_2; x_3, t_3)\, p(x_2, t_2 \mid x_3, t_3) .
\end{aligned}
\tag{3.2.7}
$$


This equation is also always valid. We now introduce the Markov assumption. If
$t_1 \ge t_2 \ge t_3$, we can drop the $t_3$ dependence in the doubly conditioned probability
and write


$$ p(x_1, t_1 \mid x_3, t_3) = \int dx_2\, p(x_1, t_1 \mid x_2, t_2)\, p(x_2, t_2 \mid x_3, t_3) \tag{3.2.8} $$


which is the Chapman-Kolmogorov equation.

What is the essential difference between (3.2.8) and (3.2.6)? The obvious answer
is that (3.2.6) is for unconditioned probabilities, whereas (3.2.7) is for conditional
probabilities. Equation (3.2.8) is a rather complex nonlinear functional equation
relating all conditional probabilities $p(x_i, t_i \mid x_j, t_j)$ to each other, whereas (3.2.6)
simply constructs the one-time probabilities in the future $t_1$ of $t_2$, given the
conditional probability $p(x_1, t_1 \mid x_2, t_2)$.

The Chapman-Kolmogorov equation has many solutions. These are best under- 
stood by deriving the differential form which is done in Sect. 3.4.1 under certain 
rather mild conditions. 
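
As a quick numerical illustration of (3.2.8), the sketch below checks the Chapman-Kolmogorov equation for one particular kernel. It assumes the Gaussian (Brownian motion) conditional density quoted in Sect. 3.3.1 as (3.3.2); the diffusion constant `D = 1`, the sample points, the integration window and the midpoint rule are all illustrative choices, not part of the text.

```python
import math


def gauss_kernel(x, z, dt, D=1.0):
    """Gaussian transition density p(x, t + dt | z, t), as in (3.3.2)."""
    return math.exp(-(x - z) ** 2 / (4.0 * D * dt)) / math.sqrt(4.0 * math.pi * D * dt)


def ck_rhs(x1, x3, dt12, dt23, D=1.0, half_width=20.0, n=4000):
    """Midpoint-rule estimate of the integral over x2 on the right of (3.2.8).

    dt12 = t1 - t2 and dt23 = t2 - t3 are both positive (t1 >= t2 >= t3).
    """
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        x2 = -half_width + (k + 0.5) * h
        total += gauss_kernel(x1, x2, dt12, D) * gauss_kernel(x2, x3, dt23, D)
    return total * h


# Left-hand side: direct transition over the full interval t1 - t3 = 0.5 + 0.8
lhs = gauss_kernel(0.7, -0.3, 0.5 + 0.8)
# Right-hand side: integrate over the intermediate position x2
rhs = ck_rhs(0.7, -0.3, 0.5, 0.8)
print(lhs, rhs)
```

The two numbers agree to quadrature accuracy for this kernel; the check says nothing about other kernels, each of which has to be tested separately.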


3.2.2 Discrete State Spaces 




In the case where we have a discrete variable, we will use the symbol $N = (N_1, N_2,
N_3, \ldots)$, where the $N_i$ are random variables which take on integral values. Clearly,
we now replace


$$ \int dx \to \sum_{n} \tag{3.2.9} $$

and we can now write the Chapman-Kolmogorov equation for such a process as

$$ P(n_1, t_1 \mid n_3, t_3) = \sum_{n_2} P(n_1, t_1 \mid n_2, t_2)\, P(n_2, t_2 \mid n_3, t_3) . \tag{3.2.10} $$


This is now a matrix multiplication, with possibly infinite matrices. 
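
The matrix form of (3.2.10) can be sketched in a few lines. The example assumes a hypothetical two-state chain that is homogeneous in time with unit time steps, so that the conditional probability over an interval of $m$ steps is the $m$-th power of a one-step matrix; the matrix `STEP` and the intervals are invented for illustration.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(step, m):
    """m-fold product of the one-step matrix (identity for m = 0)."""
    n = len(step)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(m):
        result = mat_mul(step, result)
    return result


# Hypothetical one-step matrix: STEP[i][j] = P(state i at s + 1 | state j at s).
# Each column sums to one.
STEP = [[0.9, 0.2],
        [0.1, 0.8]]

p_12 = mat_pow(STEP, 3)   # P(n1, t1 | n2, t2) with t1 - t2 = 3
p_23 = mat_pow(STEP, 2)   # P(n2, t2 | n3, t3) with t2 - t3 = 2
p_13 = mat_pow(STEP, 5)   # P(n1, t1 | n3, t3) with t1 - t3 = 5

product = mat_mul(p_12, p_23)  # sum over the intermediate state n2
print(product)
print(p_13)
```

Row index is the final state and column index the initial state, so the sum over $n_2$ in (3.2.10) is exactly the inner index of the matrix product.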


3.2.3 More General Measures 


A more general formulation would assume a measure $d\mu(x)$ instead of $dx$ where a
variety of choices can be made. For example, if $\mu(x)$ is a step function with steps at
integral values of $x$, we recover the discrete state space form. Most mathematical
works attempt to be as general as possible. For applications, such generality can
lead to lack of clarity so, where possible, we will favour a more specific notation.


3.3 Continuity in Stochastic Processes


Whether or not the random variable $X(t)$ has a continuous range of possible values
is a completely different question from whether the sample path of $X(t)$ is a continuous
function of $t$. For example, in a gas composed of molecules with velocities $V(t)$,
it is clear that all possible values of $V(t)$ are in principle realisable, so that the range
of $V(t)$ is continuous. However, a model of collisions in a gas of hard spheres as
occurring instantaneously is often considered, and in such a model the velocity before
the collision, $v_i$, will change instantaneously at the time of impact to another
value $v_f$, so the sample path of $V(t)$ is not continuous. Nevertheless, in such a
model, the position of a gas molecule $X(t)$ would be expected to be continuous.

A major question now arises. Do Markov processes with continuous sample paths 
actually exist in reality? Notice the combination of Markov and continuous. It is 
almost certainly the case that in a classical picture (i.e., not quantum mechanical), 
all variables with a continuous range have continuous sample paths. Even the hard 
sphere gas mentioned above is an idealisation and more realistically, one should 
allow some potential to act which would continuously deflect the molecules during 
a collision. But it would also be the case that, if we observe on such a fine time scale, 
the process will probably not be Markovian. The immediate history of the whole 
system will almost certainly be required to predict even the probabilistic future. 
This is certainly borne out in all attempts to derive Markovian probabilistic equations
from mechanics. Equations which are derived are rarely truly Markovian;
rather, there is a certain characteristic memory time during which the previous
history is important (Haake [3.1]).

This means that there is really no such thing as a Markov process; rather,
there may be systems whose memory time is so small that, on the time scale on
which we carry out observations, it is fair to regard them as being well approximated
by a Markov process. But in this case, the question of whether the sample
paths are actually continuous is not relevant. The sample paths of the approximating
Markov process certainly need not be continuous. Even if collisions of molecules
are not accurately modelled by hard spheres, during the time taken for a
collision, a finite change of velocity takes place and this will appear in the approximating
Markov process as a discrete step. On this time scale, even the position
may change discontinuously, thus giving the picture of Brownian motion as
modelled by Einstein.

In chemical reactions, for example, the time taken for an individual reaction to 
proceed to completion—roughly of the same order of magnitude as the collision 
time for molecules—provides yet another minimum time, since during this time, 
states which cannot be described in terms of individual molecules exist. Here, there- 
fore, the very description of the state in terms of individual molecules requires a 
certain minimum time scale to be considered. 

However, Markov processes with continuous sample paths do exist mathematically
and are useful in describing reality. The model of the gas mentioned above
provides a useful example. The position of the molecule is indeed probably best
modelled as changing discontinuously by discrete jumps. Compared to the distances
travelled, however, these jumps are infinitesimal and a continuous curve provides
a good approximation to the sample path. On the other hand, the velocities can
change by amounts which are of the same order of magnitude as typical values attained
in practice. The average velocity of a molecule in a gas is about 1000 m/s
and during a collision can easily reverse its sign. The velocities simply cannot reach
(with any significant probability) values for which the changes of velocity can be
regarded as very small. Hence, there is no sense in a continuous path description of
velocities in a gas.


3.3.1 Mathematical Definition of a Continuous Markov Process 


For a Markov process, it can be shown [3.2] that with probability one, the sample
paths are continuous functions of $t$, if for any $\varepsilon > 0$ we have


$$ \lim_{\Delta t \to 0} \frac{1}{\Delta t} \int_{|x - z| > \varepsilon} dx\, p(x, t + \Delta t \mid z, t) = 0 \tag{3.3.1} $$


uniformly in $z$, $t$ and $\Delta t$.

This means that the probability for the final position $x$ to be finitely different
from $z$ goes to zero faster than $\Delta t$, as $\Delta t$ goes to zero. [Equation (3.3.1) is sometimes
called the Lindeberg condition.]


Examples

i) Einstein’s solution for his $f(x, t)$ (Sect. 1.2.1) is really the conditional probability
$p(x, t \mid 0, 0)$. Following his method we would find


$$ p(x, t + \Delta t \mid z, t) = (4\pi D \Delta t)^{-1/2} \exp[-(x - z)^2 / 4 D \Delta t] \tag{3.3.2} $$


and it is easy to check that (3.3.1) is satisfied in this case. Thus, Brownian motion
in Einstein’s formulation has continuous sample paths.


ii) Cauchy Process: Suppose

$$ p(x, t + \Delta t \mid z, t) = \frac{\Delta t}{\pi}\, \frac{1}{(x - z)^2 + \Delta t^2} . \tag{3.3.3} $$


Then this does not satisfy (3.3.1) so the sample paths are discontinuous.

However, in both cases, we have as required for consistency


$$ \lim_{\Delta t \to 0} p(x, t + \Delta t \mid z, t) = \delta(x - z) , \tag{3.3.4} $$


and it is easy to show that in both cases, the Chapman-Kolmogorov equation is
satisfied.
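
The contrast between the two examples under condition (3.3.1) can be made concrete, since both tail integrals have closed forms. The sketch below evaluates the quantity inside the limit of (3.3.1) for the Gaussian kernel (3.3.2) and the Cauchy kernel (3.3.3); the values of `eps`, `dt` and `D` are arbitrary illustrative choices.

```python
import math


def brownian_tail_rate(eps, dt, D=1.0):
    """(1/dt) * P(|x - z| > eps) for the Gaussian kernel (3.3.2).

    The kernel has variance 2*D*dt, so the two-sided tail is
    erfc(eps / (2*sqrt(D*dt))).
    """
    return math.erfc(eps / (2.0 * math.sqrt(D * dt))) / dt


def cauchy_tail_rate(eps, dt):
    """(1/dt) * P(|x - z| > eps) for the Cauchy kernel (3.3.3).

    The two-sided tail is 1 - (2/pi)*atan(eps/dt) = (2/pi)*atan(dt/eps).
    """
    return (2.0 / math.pi) * math.atan(dt / eps) / dt


eps = 0.1
for dt in (1e-2, 1e-3, 1e-4):
    print(dt, brownian_tail_rate(eps, dt), cauchy_tail_rate(eps, dt))
```

As `dt` shrinks, the Gaussian value collapses towards zero, while the Cauchy value settles at $2/(\pi\varepsilon)$, which is nonzero for every $\varepsilon$; this is the sense in which (3.3.3) fails (3.3.1).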


The difference between the two processes just described is illustrated in Fig. 3.1,
in which simulations of both processes are given. The difference between the two is
striking. Notice, however, that even the Brownian motion curve is extremely irregular,
even though continuous; in fact it is nowhere differentiable. The Cauchy-process
curve is, of course, wildly discontinuous.

Fig. 3.1. Illustration of sample paths of the Cauchy process $X(t)$ (dashed line) and
Brownian motion $W(t)$ (solid line), plotted against $t$
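
Sample paths of the kind shown in Fig. 3.1 can be generated by summing independent increments drawn from the kernels (3.3.2) and (3.3.3). The following is only a sketch: the step size, the number of steps, `D = 1`, the seeds and the inverse-transform sampling of the Cauchy increment are all assumptions made for illustration, not a description of how the figure was produced.

```python
import math
import random


def brownian_path(n_steps, dt, D=1.0, rng=None):
    """Path with independent Gaussian increments of variance 2*D*dt, cf. (3.3.2)."""
    rng = rng or random.Random(0)
    sigma = math.sqrt(2.0 * D * dt)
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, sigma))
    return path


def cauchy_path(n_steps, dt, rng=None):
    """Path with independent Cauchy increments of scale dt, cf. (3.3.3).

    If u is uniform on (0, 1), then dt * tan(pi * (u - 1/2)) has density
    (dt/pi) / (x**2 + dt**2).
    """
    rng = rng or random.Random(0)
    path = [0.0]
    for _ in range(n_steps):
        u = rng.random()
        path.append(path[-1] + dt * math.tan(math.pi * (u - 0.5)))
    return path


w = brownian_path(5000, 1e-3, rng=random.Random(1))
x = cauchy_path(5000, 1e-3, rng=random.Random(1))
print(max(abs(b - a) for a, b in zip(w, w[1:])))  # largest Brownian increment
print(max(abs(b - a) for a, b in zip(x, x[1:])))  # largest Cauchy increment
```

Plotting `w` and `x` against the step index gives curves of the two types described above; the occasional very large Cauchy increments are what appear as jumps.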


3.4 Differential Chapman-Kolmogorov Equation 


Under appropriate assumptions, the Chapman-Kolmogorov equation can be re- 
duced to a differential equation. The assumptions made are closely connected with 
the continuity properties of the process under consideration. Because of the form 
of the continuity condition (3.3.1), one is led to consider a method of dividing 
the differentiability conditions into parts, one corresponding to continuous motion 
of a representative point and the other to discontinuous motion. 

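We require the following conditions for all $\varepsilon > 0$:


i) $$ \lim_{\Delta t \to 0} p(x, t + \Delta t \mid z, t) / \Delta t = W(x \mid z, t) \tag{3.4.1} $$


uniformly in $x$, $z$, and $t$ for $|x - z| \ge \varepsilon$;


ii) $$ \lim_{\Delta t \to 0} \frac{1}{\Delta t} \int_{|x - z| < \varepsilon} dx\, (x_i - z_i)\, p(x, t + \Delta t \mid z, t) = A_i(z, t) + O(\varepsilon) ; \tag{3.4.2} $$


iii) $$ \lim_{\Delta t \to 0} \frac{1}{\Delta t} \int_{|x - z| < \varepsilon} dx\, (x_i - z_i)(x_j - z_j)\, p(x, t + \Delta t \mid z, t) = B_{ij}(z, t) + O(\varepsilon) ; \tag{3.4.3} $$


the last two being uniform in $z$, $\varepsilon$, and $t$.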
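Notice that all higher-order coefficients of the form in (3.4.2,3) must vanish. For
example, consider the third-order quantity defined by


$$ \lim_{\Delta t \to 0} \frac{1}{\Delta t} \int_{|x - z| < \varepsilon} dx\, (x_i - z_i)(x_j - z_j)(x_k - z_k)\, p(x, t + \Delta t \mid z, t) \equiv C_{ijk}(z, t) + O(\varepsilon) . \tag{3.4.4} $$

Since $C_{ijk}$ is symmetric in $i, j, k$, consider


$$ \sum_{i,j,k} \alpha_i \alpha_j \alpha_k C_{ijk}(z, t) \equiv \bar{C}(\alpha, z, t) \tag{3.4.5} $$

so that

$$ C_{ijk}(z, t) = \frac{1}{3!} \frac{\partial^3}{\partial \alpha_i\, \partial \alpha_j\, \partial \alpha_k} \bar{C}(\alpha, z, t) . \tag{3.4.6} $$

Then,

$$
\begin{aligned}
|\bar{C}(\alpha, z, t)| &\le \lim_{\Delta t \to 0} \frac{1}{\Delta t} \int_{|x - z| < \varepsilon} |\alpha \cdot (x - z)|\, [\alpha \cdot (x - z)]^2\, p(x, t + \Delta t \mid z, t)\, dx + O(\varepsilon) \\
 &\le |\alpha|\, \varepsilon \lim_{\Delta t \to 0} \frac{1}{\Delta t} \int_{|x - z| < \varepsilon} [\alpha \cdot (x - z)]^2\, p(x, t + \Delta t \mid z, t)\, dx + O(\varepsilon) \\
 &= \varepsilon\, |\alpha| \Big[ \sum_{i,j} \alpha_i \alpha_j B_{ij}(z, t) + O(\varepsilon) \Big] + O(\varepsilon) \\
 &= O(\varepsilon)
\end{aligned}
\tag{3.4.7}
$$


so that $C$ is zero. Similarly, we can show that all corresponding higher-order quantities
also vanish.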

According to the condition for continuity (3.3.1), the process can only have continuous
paths if $W(x \mid z, t)$ vanishes for all $x \ne z$. Thus, this function must in some
way describe discontinuous motion, while the quantities $A_i$ and $B_{ij}$ must be
connected with continuous motion.


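3.4.1 Derivation of the Differential Chapman-Kolmogorov Equation


We consider the time evolution of the expectation of a function $f(z)$ which is
twice continuously differentiable.
Thus,


$$
\begin{aligned}
\partial_t \int dx\, f(x)\, p(x, t \mid y, t')
 &= \lim_{\Delta t \to 0} \Big\{ \int dx\, f(x) [p(x, t + \Delta t \mid y, t') - p(x, t \mid y, t')] \Big\} / \Delta t && (3.4.8) \\
 &= \lim_{\Delta t \to 0} \Big\{ \int dx \int dz\, f(x)\, p(x, t + \Delta t \mid z, t)\, p(z, t \mid y, t') \\
 &\qquad\qquad - \int dz\, f(z)\, p(z, t \mid y, t') \Big\} / \Delta t , && (3.4.9)
\end{aligned}
$$


where we have used the Chapman-Kolmogorov equation in the positive term of
(3.4.8) to produce the corresponding term in (3.4.9).

We now divide the integral over $x$ into two regions $|x - z| \ge \varepsilon$ and $|x - z|
< \varepsilon$. When $|x - z| < \varepsilon$, since $f(z)$ is, by assumption, twice continuously differentiable,
we may write

$$ f(x) = f(z) + \sum_i \frac{\partial f(z)}{\partial z_i} (x_i - z_i) + \sum_{i,j} \frac{1}{2} \frac{\partial^2 f(z)}{\partial z_i\, \partial z_j} (x_i - z_i)(x_j - z_j) + |x - z|^2 R(x, z) , \tag{3.4.10} $$

where we have (again by the twice continuous differentiability)

$$ |R(x, z)| \to 0 \quad \text{as} \quad |x - z| \to 0 . \tag{3.4.11} $$


Now substitute in (3.4.9):


$$
\begin{aligned}
(3.4.9) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \Big\{
 & \iint_{|x - z| < \varepsilon} dx\, dz \Big[ \sum_i (x_i - z_i) \frac{\partial f}{\partial z_i} + \sum_{i,j} \frac{1}{2} (x_i - z_i)(x_j - z_j) \frac{\partial^2 f}{\partial z_i\, \partial z_j} \Big] \\
 & \qquad \times p(x, t + \Delta t \mid z, t)\, p(z, t \mid y, t') \\
 & + \iint_{|x - z| < \varepsilon} dx\, dz\, |x - z|^2 R(x, z)\, p(x, t + \Delta t \mid z, t)\, p(z, t \mid y, t') \\
 & + \iint_{|x - z| \ge \varepsilon} dx\, dz\, f(x)\, p(x, t + \Delta t \mid z, t)\, p(z, t \mid y, t') \\
 & + \iint_{|x - z| < \varepsilon} dx\, dz\, f(z)\, p(x, t + \Delta t \mid z, t)\, p(z, t \mid y, t') \\
 & - \iint dx\, dz\, f(z)\, p(x, t + \Delta t \mid z, t)\, p(z, t \mid y, t') \Big\}
\end{aligned}
\tag{3.4.12}
$$

[notice that since $p(x, t + \Delta t \mid z, t)$ is a probability, the integral over $x$ in the last
term gives 1; this is simply the last term in (3.4.9)].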


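We now consider these line by line.


Lines 1,2: by the assumed uniform convergence, we take the limit inside the integral
to obtain [using conditions (ii) and (iii) of Sect. 3.4]


$$ \int dz \Big[ \sum_i A_i(z) \frac{\partial f}{\partial z_i} + \sum_{i,j} \frac{1}{2} B_{ij}(z) \frac{\partial^2 f}{\partial z_i\, \partial z_j} \Big] p(z, t \mid y, t') + O(\varepsilon) . \tag{3.4.13} $$


Line 3: this is a remainder term and vanishes as $\varepsilon \to 0$. For


$$
\begin{aligned}
\Big| \frac{1}{\Delta t} \int_{|x - z| < \varepsilon} dx\, |x - z|^2 R(x, z)\, p(x, t + \Delta t \mid z, t) \Big|
 &\le \Big[ \frac{1}{\Delta t} \int_{|x - z| < \varepsilon} dx\, |x - z|^2\, p(x, t + \Delta t \mid z, t) \Big] \max_{|x - z| < \varepsilon} |R(x, z)| \\
 &\to \Big[ \sum_i B_{ii}(z, t) + O(\varepsilon) \Big] \Big\{ \max_{|x - z| < \varepsilon} |R(x, z)| \Big\} .
\end{aligned}
\tag{3.4.14}
$$

From (3.4.11) we can see that as $\varepsilon \to 0$, the factor in curly brackets vanishes.


Lines 4-6: We can put these all together to obtain


$$ \iint_{|x - z| \ge \varepsilon} dx\, dz\, f(z) [W(z \mid x, t)\, p(x, t \mid y, t') - W(x \mid z, t)\, p(z, t \mid y, t')] . \tag{3.4.15} $$


The whole right-hand side of (3.4.12) is independent of $\varepsilon$. Hence, taking the limit
$\varepsilon \to 0$, we find


$$
\begin{aligned}
\partial_t \int dz\, f(z)\, p(z, t \mid y, t') = {} & \int dz \Big[ \sum_i A_i(z, t) \frac{\partial f(z)}{\partial z_i} + \frac{1}{2} \sum_{i,j} B_{ij}(z, t) \frac{\partial^2 f(z)}{\partial z_i\, \partial z_j} \Big] p(z, t \mid y, t') \\
 & + \int dz\, f(z) \Big\{ \mathrm{P}\!\!\int dx\, [W(z \mid x, t)\, p(x, t \mid y, t') - W(x \mid z, t)\, p(z, t \mid y, t')] \Big\} .
\end{aligned}
\tag{3.4.16}
$$


Notice, however, that we use the definition


$$ \lim_{\varepsilon \to 0} \int_{|x - z| > \varepsilon} dx\, F(x, z) \equiv \mathrm{P}\!\!\int dx\, F(x, z) \tag{3.4.17} $$


for a principal value integral of a function $F(x, z)$. For (3.4.16) to have any meaning,
this integral should exist. Equation (3.4.1) defines $W(x \mid z, t)$ only for $x \ne z$ and
hence leaves open the possibility that it is infinite at $x = z$, as is indeed the case
for the Cauchy process, discussed in Sect. 3.3.1, for which


$$ W(x \mid z, t) = 1 / [\pi (x - z)^2] . \tag{3.4.18} $$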


However, if p(x, t| y, t’) is continuous and once differentiable, then the principal 
value integral exists. In the remainder of the book we shall not write this integral 
explicitly as a principal value integral since one rarely considers the singular cases 
for which it is necessary. 

The final step now is to integrate by parts. We find


$$
\begin{aligned}
\int dz\, f(z)\, \partial_t p(z, t \mid y, t') = \int dz\, f(z) \Big\{
 & -\sum_i \frac{\partial}{\partial z_i} [A_i(z, t)\, p(z, t \mid y, t')] \\
 & + \sum_{i,j} \frac{1}{2} \frac{\partial^2}{\partial z_i\, \partial z_j} [B_{ij}(z, t)\, p(z, t \mid y, t')] \\
 & + \int dx\, [W(z \mid x, t)\, p(x, t \mid y, t') - W(x \mid z, t)\, p(z, t \mid y, t')] \Big\} \\
 & + \text{surface terms.}
\end{aligned}
\tag{3.4.19}
$$


We have not specified the range of the integrals. Suppose the process is confined to
a region $R$ with surface $S$. Then clearly,


$$ p(x, t \mid z, t') = 0 \quad \text{unless both } x \text{ and } z \in R . \tag{3.4.20} $$

It is clear that by definition we have

$$ W(x \mid z, t) = 0 \quad \text{unless both } x \text{ and } z \in R . \tag{3.4.21} $$

But the conditions on $A_i(z, t)$ and $B_{ij}(z, t)$ can result in discontinuities in these functions
as defined by (3.4.2,3) since the conditional probability $p(x, t + \Delta t \mid z, t)$ can
very reasonably change discontinuously as $z$ crosses the boundary of $R$, reflecting
the fact that no transitions are allowed from outside $R$ to inside $R$.

In integrating by parts, we are forced to differentiate both $A_i$ and $B_{ij}$ and by our
reasoning above, one cannot assume that this is possible on the boundary of the
region. Hence, let us choose $f(z)$ to be arbitrary but nonvanishing only in an arbitrary
region $R'$ entirely contained in $R$. We can then deduce that for all $z$ in the
interior of $R$,


$$
\begin{aligned}
\partial_t p(z, t \mid y, t') = {} & -\sum_i \frac{\partial}{\partial z_i} [A_i(z, t)\, p(z, t \mid y, t')] \\
 & + \sum_{i,j} \frac{1}{2} \frac{\partial^2}{\partial z_i\, \partial z_j} [B_{ij}(z, t)\, p(z, t \mid y, t')] \\
 & + \int dx\, [W(z \mid x, t)\, p(x, t \mid y, t') - W(x \mid z, t)\, p(z, t \mid y, t')] .
\end{aligned}
\tag{3.4.22}
$$


Surface terms do not arise, since they necessarily vanish.

This equation does not seem to have any agreed name in the literature. Since it 
is purely a differential form of the Chapman-Kolmogorov equation, I propose to 
call it the differential Chapman-Kolmogorov equation. 


3.4.2 Status of the Differential Chapman-Kolmogorov Equation 


From our derivation it is not clear to what extent solutions of the differential 
Chapman-Kolmogorov equation are solutions of the Chapman-Kolmogorov equ- 
ation itself or indeed, to what extent solutions exist. It is certainly true, however, 
that a set of conditional probabilities which obey the Chapman-Kolmogorov 
equation does generate a Markov process, in the sense that the joint probabilities 
so generated satisfy all probability axioms. 

It can be shown [3.3] that, under certain conditions, if we specify $A(x, t)$, $B(x, t)$
(which must be positive semi-definite), and $W(x \mid y, t)$ (which must be non-negative),
a non-negative solution to the differential Chapman-Kolmogorov equation
exists, and this solution also satisfies the Chapman-Kolmogorov equation. The
conditions to be satisfied are the initial condition,


$$ p(z, t \mid y, t) = \delta(y - z) , $$


which follows from the definition of the conditional probability density, and any 
appropriate boundary conditions. These are very difficult to specify in the full 
equation, but in the case of the Fokker-Planck equation (Sect. 3.5.2) are given in 
Chap. 5. 


3.5 Interpretation of Conditions and Results 


Each of the conditions (i), (ii), (iii) of Sect. 3.4 can now be seen to give rise to
a distinctive part of the equation, whose interpretation is rather straightforward.
We can identify three processes taking place, which are known as jumps, drift
and diffusion.


3.5.1 Jump Processes: The Master Equation


We consider a case in which

$$ A_i(z, t) = B_{ij}(z, t) = 0 \tag{3.5.1} $$


so that we now have the Master equation:


$$ \partial_t p(z, t \mid y, t') = \int dx\, [W(z \mid x, t)\, p(x, t \mid y, t') - W(x \mid z, t)\, p(z, t \mid y, t')] . \tag{3.5.2} $$


To first order in $\Delta t$ we solve approximately, as follows. Notice that


$$ p(z, t \mid y, t) = \delta(y - z) . \tag{3.5.3} $$

Hence,

$$ p(z, t + \Delta t \mid y, t) = \delta(y - z) \Big[ 1 - \int dx\, W(x \mid y, t)\, \Delta t \Big] + W(z \mid y, t)\, \Delta t . \tag{3.5.4} $$


We see that for any $\Delta t$ there is a finite probability, given by the coefficient of the
$\delta(y - z)$ in (3.5.4), for the particle to stay at the original position $y$. The distribution
of those particles which do not remain at $y$ is given by $W(z \mid y, t)$ after
appropriate normalisation. Thus, a typical path $X(t)$ will consist of sections of
straight lines $X(t)$ = constant, interspersed with discontinuous jumps whose distribution
is given by $W(z \mid y, t)$. For this reason, the process is known as a jump
process. The paths are discontinuous at discrete points.

In the case where the state space consists of integers only, the Master equation
takes the form


$$ \partial_t P(n, t \mid n', t') = \sum_m [W(n \mid m, t)\, P(m, t \mid n', t') - W(m \mid n, t)\, P(n, t \mid n', t')] . \tag{3.5.5} $$


There is no longer any question that only jumps can occur, since only discrete values
of the state variable $N(t)$ are allowed. It is most important, however, to be aware
that a pure jump process can occur even though the variable $X(t)$ can take on a continuous
range of values.
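
A minimal numerical sketch of the discrete Master equation (3.5.5) is given below for a two-state system with time-independent rates. The rates, the initial condition, the forward-Euler integrator and its step size are assumptions chosen for illustration; for this case the exact solution is $P(0, t) = \tfrac{1}{3} + \tfrac{2}{3} e^{-3t}$, which gives something to compare against.

```python
import math


def master_rhs(p, w):
    """Right-hand side of (3.5.5); w[n][m] is the rate W(n|m) for a jump m -> n."""
    n_states = len(p)
    out = []
    for n in range(n_states):
        gain = sum(w[n][m] * p[m] for m in range(n_states) if m != n)
        loss = sum(w[m][n] for m in range(n_states) if m != n) * p[n]
        out.append(gain - loss)
    return out


def integrate(p0, w, t_end, dt=1e-4):
    """Forward-Euler integration of the Master equation from t = 0 to t_end."""
    p = list(p0)
    for _ in range(int(round(t_end / dt))):
        dp = master_rhs(p, w)
        p = [pi + dt * di for pi, di in zip(p, dp)]
    return p


# Hypothetical rates: W(1|0) = 2.0 (jump 0 -> 1) and W(0|1) = 1.0 (jump 1 -> 0)
W = [[0.0, 1.0],
     [2.0, 0.0]]

p = integrate([1.0, 0.0], W, t_end=1.0)        # start with certainty in state 0
exact_p0 = 1.0 / 3.0 + (2.0 / 3.0) * math.exp(-3.0)
print(p, exact_p0)
```

Because every gain term for one state is a loss term for another, the total probability is conserved by the scheme up to rounding error.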


3.5.2 Diffusion Processes—the Fokker-Planck Equation 


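If we assume the quantities $W(z \mid x, t)$ to be zero, the differential Chapman-Kolmogorov
equation reduces to the Fokker-Planck equation:


$$
\begin{aligned}
\frac{\partial p(z, t \mid y, t')}{\partial t} = {} & -\sum_i \frac{\partial}{\partial z_i} [A_i(z, t)\, p(z, t \mid y, t')] \\
 & + \frac{1}{2} \sum_{i,j} \frac{\partial^2}{\partial z_i\, \partial z_j} [B_{ij}(z, t)\, p(z, t \mid y, t')]
\end{aligned}
\tag{3.5.6}
$$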


and the corresponding process is known mathematically as a diffusion process. The
vector $A(z, t)$ is known as the drift vector and the matrix $B(z, t)$ as the diffusion
matrix. The diffusion matrix is positive semidefinite and symmetric as a result of its
definition in (3.4.3). It is easy to see that from the definition of $W(x \mid z, t)$ (3.4.1),
the requirement (3.3.1) for continuity of the sample paths is satisfied if $W(x \mid z, t)$ is
zero. Hence, the Fokker-Planck equation describes a process in which $X(t)$ has continuous
sample paths. In fact, we can heuristically give a much more definite description
of the process. Let us consider computing $p(z, t + \Delta t \mid y, t)$, given that


$$ p(z, t \mid y, t) = \delta(z - y) . \tag{3.5.7} $$


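For a small $\Delta t$, the solution of the Fokker-Planck equation will still be on the
whole sharply peaked, and hence derivatives of $A_i(z, t)$ and $B_{ij}(z, t)$ will be negligible
compared to those of $p$. We are thus reduced to solving, approximately


$$ \frac{\partial p(z, t \mid y, t')}{\partial t} = -\sum_i A_i(y, t) \frac{\partial p(z, t \mid y, t')}{\partial z_i} + \sum_{i,j} \frac{1}{2} B_{ij}(y, t) \frac{\partial^2 p(z, t \mid y, t')}{\partial z_i\, \partial z_j} , \tag{3.5.8} $$


where we have also neglected the time dependence of $A_i$ and $B_{ij}$ for small $t - t'$.
Equation (3.5.8) can now be solved, subject to the initial condition (3.5.7), and
we get


$$
\begin{aligned}
p(z, t + \Delta t \mid y, t) = {} & (2\pi)^{-N/2} \{\det[B(y, t)]\}^{-1/2} [\Delta t]^{-N/2} \\
 & \times \exp\Big\{ -\frac{1}{2} \frac{[z - y - A(y, t)\Delta t]^{\mathrm{T}} [B(y, t)]^{-1} [z - y - A(y, t)\Delta t]}{\Delta t} \Big\} ,
\end{aligned}
\tag{3.5.9}
$$

that is, a Gaussian distribution with variance matrix $B(y, t)\Delta t$ and mean $y + A(y, t)\Delta t$.
We get the picture of the system moving with a systematic drift, whose velocity is
$A(y, t)$, on which is superimposed a Gaussian fluctuation with covariance matrix
$B(y, t)\Delta t$, that is, we can write


$$ y(t + \Delta t) = y(t) + A(y(t), t)\Delta t + \eta(t)\Delta t^{1/2} , \tag{3.5.10} $$

where

$$ \langle \eta(t) \rangle = 0 \tag{3.5.11} $$

$$ \langle \eta(t)\eta(t)^{\mathrm{T}} \rangle = B(y, t) . \tag{3.5.12} $$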


It is easy to see that this picture gives


i) sample paths which are always continuous, for, clearly, as $\Delta t \to 0$, $y(t + \Delta t)
\to y(t)$;

ii) sample paths which are nowhere differentiable, because of the $\Delta t^{1/2}$ occurring in
(3.5.10).


We shall see later, in Chap. 4 that the heuristic picture of (3.5.10) can be made 
much more precise and leads to the concept of the stochastic differential equation. 
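
The update rule (3.5.10) translates directly into a simulation step. The sketch below is one-dimensional and uses a hypothetical drift $A(y, t) = -y$ and constant diffusion $B = 1$ purely as an example; the function names, step size and seed are likewise illustrative assumptions. For this choice the variance settles near $B/2 = 0.5$ at long times, and setting $B = 0$ reduces the scheme to a plain Euler step for a deterministic equation.

```python
import math
import random


def step(y, t, dt, drift, diffusion, rng):
    """One application of (3.5.10) in one dimension.

    eta is Gaussian with zero mean and variance B(y, t), as in (3.5.11,12).
    """
    eta = rng.gauss(0.0, math.sqrt(diffusion(y, t)))
    return y + drift(y, t) * dt + eta * math.sqrt(dt)


def simulate(y0, t_end, dt, drift, diffusion, rng):
    """Iterate the update from t = 0 to t_end and return the final value."""
    y, t = y0, 0.0
    for _ in range(int(round(t_end / dt))):
        y = step(y, t, dt, drift, diffusion, rng)
        t += dt
    return y


def linear_drift(y, t):
    return -y          # hypothetical drift A(y, t) = -y


def unit_diffusion(y, t):
    return 1.0         # hypothetical diffusion B(y, t) = 1


def no_diffusion(y, t):
    return 0.0         # B = 0: purely deterministic motion


rng = random.Random(42)
finals = [simulate(0.0, 5.0, 0.01, linear_drift, unit_diffusion, rng)
          for _ in range(400)]
mean = sum(finals) / len(finals)
var = sum((v - mean) ** 2 for v in finals) / (len(finals) - 1)
print(mean, var)

# With B = 0 the same routine follows the deterministic decay y0 * exp(-t)
y_det = simulate(1.0, 1.0, 0.001, linear_drift, no_diffusion, rng)
print(y_det, math.exp(-1.0))
```

The increment over one step is of order $\Delta t^{1/2}$, so the ratio increment/$\Delta t$ grows without bound as $\Delta t \to 0$, which is the nondifferentiability noted in ii).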


3.5.3 Deterministic Processes—Liouville’s Equation 


It is possible that in the differential Chapman-Kolmogorov equation (3.4.22) only
the first term is nonzero, so we are led to the special case of a Liouville equation:

$$ \frac{\partial p(z, t \mid y, t')}{\partial t} = -\sum_i \frac{\partial}{\partial z_i} [A_i(z, t)\, p(z, t \mid y, t')] \tag{3.5.13} $$


which occurs in classical mechanics. This equation describes a completely deterministic
motion, i.e., if $x(y, t)$ is the solution of the ordinary differential equation


$$ \frac{dx(t)}{dt} = A[x(t), t] \tag{3.5.14} $$


with

$$ x(y, t') = y , \tag{3.5.15} $$


then the solution to (3.5.13) with initial condition


$$ p(z, t' \mid y, t') = \delta(z - y) \tag{3.5.16} $$

is

$$ p(z, t \mid y, t') = \delta[z - x(y, t)] . \tag{3.5.17} $$


The proof of this assertion is best obtained by direct substitution. For


$$ -\sum_i \frac{\partial}{\partial z_i} \{ A_i(z, t)\, \delta[z - x(y, t)] \} \tag{3.5.18} $$

$$ = -\sum_i \frac{\partial}{\partial z_i} \{ A_i(x(y, t), t)\, \delta[z - x(y, t)] \} \tag{3.5.19} $$

$$ = -\sum_i \Big[ A_i(x(y, t), t) \frac{\partial}{\partial z_i} \delta[z - x(y, t)] \Big] \tag{3.5.20} $$

and

$$ \frac{\partial}{\partial t} \delta[z - x(y, t)] = -\sum_i \frac{\partial}{\partial z_i} \delta[z - x(y, t)]\, \frac{dx_i(y, t)}{dt} \tag{3.5.21} $$


and by use of (3.5.14), we see that (3.5.20,21) are equal. Thus, if the particle is in a
well-defined initial position $y$ at time $t'$, it stays on the trajectory obtained by solving
the ordinary differential equation (3.5.14).

Hence, deterministic motion, as defined by a first-order differential equation of
the form (3.5.14), is an elementary form of Markov process. The solution (3.5.17)
is, of course, merely a special case of the kind of process approximated by equations
like (3.5.9) in which the Gaussian part is zero.


3.5.4 General Processes 


In general, none of the quantities in $A(z, t)$, $B(z, t)$ and $W(x \mid z, t)$ need vanish, and
in this case we obtain a process whose sample paths are as illustrated in Fig. 3.2,
i.e., a piecewise continuous path made up of pieces which correspond to a diffusion
process with a nonzero drift, onto which is superimposed a fluctuating part.

Fig. 3.2. Illustration of a sample path of a general Markov process, in which
drift, diffusion and jumps exist ($Z(t)$ plotted against $t$)

Fig. 3.3. Sample path of a Markov process with only drift and jumps ($Z(t)$ plotted
against $t$)

It is also possible that A(z, t) is nonzero, but B(z, t) is zero and here the sample 
paths are, as in Fig. 3.3, composed of pieces of smooth curve [solutions of (3.5.14)] 
with discontinuities superimposed. This is very like the picture one would expect 
in a dilute gas where the particles move freely between collisions which cause an 
instantaneous change in momentum, though not position. 


3.6 Equations for Time Development in Initial Time—Backward 
Equations 


We can derive much more simply than in Sect. 3.4, some equations which give the
time development with respect to the initial variables $y, t'$ of $p(x, t \mid y, t')$.
We consider


$$
\begin{aligned}
& \lim_{\Delta t' \to 0} \frac{1}{\Delta t'} [p(x, t \mid y, t' + \Delta t') - p(x, t \mid y, t')] && (3.6.1) \\
&= \lim_{\Delta t' \to 0} \frac{1}{\Delta t'} \int dz\, p(z, t' + \Delta t' \mid y, t') [p(x, t \mid y, t' + \Delta t') - p(x, t \mid z, t' + \Delta t')] && (3.6.2)
\end{aligned}
$$


by use of the Chapman-Kolmogorov equation in the second term and by noting
that the first term gives $1 \times p(x, t \mid y, t' + \Delta t')$.
The assumptions that are necessary are now the existence of all relevant derivatives,
and that $p(x, t \mid y, t')$ is continuous and bounded in $x, t, t'$ for some range
$t - t' > \delta > 0$. We may then write


$$ = \lim_{\Delta t' \to 0} \frac{1}{\Delta t'} \int dz\, p(z, t' + \Delta t' \mid y, t') [p(x, t \mid y, t') - p(x, t \mid z, t')] . \tag{3.6.3} $$


We now proceed using similar techniques to those used in Sect. 3.4.1 and finally
derive


$$
\begin{aligned}
\frac{\partial p(x, t \mid y, t')}{\partial t'} = {} & -\sum_i A_i(y, t') \frac{\partial p(x, t \mid y, t')}{\partial y_i} - \frac{1}{2} \sum_{i,j} B_{ij}(y, t') \frac{\partial^2 p(x, t \mid y, t')}{\partial y_i\, \partial y_j} \\
 & + \int dz\, W(z \mid y, t') [p(x, t \mid y, t') - p(x, t \mid z, t')]
\end{aligned}
\tag{3.6.4}
$$


which will be called the backward differential Chapman-Kolmogorov equation. In
a mathematical sense, it is better defined than the corresponding forward equation
(3.4.22). The appropriate initial condition for both equations is


$$ p(x, t \mid y, t) = \delta(x - y) \quad \text{for all } t , \tag{3.6.5} $$


representing the obvious fact that if the particle is at $y$ at time $t$, the probability
density for finding it at $x$ at the same time is $\delta(x - y)$.

The forward and the backward equations are equivalent to each other. For,
solutions of the forward equation, subject to the initial condition (3.6.5) [or (3.5.4)],
and any appropriate boundary conditions, yield solutions of the Chapman-Kolmogorov
equation, as noted in Sect. 3.4.2. But these have just been shown to
yield the backward equation. (The relation between appropriate boundary conditions
for the Fokker-Planck equations is dealt with in Sect. 5.2.1,4). The basic difference
is which set of variables is held fixed. In the case of the forward equation,
we hold $(y, t')$ fixed, and solutions exist for $t \ge t'$, so that (3.6.5) is an initial condition
for the forward equation. For the backward equation, solutions exist for $t' \le t$,
so that since the backward equation expresses development in $t'$, (3.6.5) is really
better termed a final condition in this case.

Since they are equivalent, the forward and backward equations are both useful.
The forward equation gives more directly the values of measurable quantities as a
function of the observed time, $t$, and tends to be used more commonly in applications.
The backward equation finds most application in the study of first passage
time or exit problems, in which we find the probability that a particle leaves a
region in a given time.


3.7 Stationary and Homogeneous Markov Processes 


In Sect. 1.4.3 we met the concept of a stationary process, which represents the
stochastic motion of a system which has settled down to a steady state, and whose
stochastic properties are independent of when they are measured. Stationarity
can be defined in various degrees, but we shall reserve the term “stationary process”
for a strict definition, namely, a stochastic process $X(t)$ is stationary if $X(t)$ and the
process $X(t + \varepsilon)$ have the same statistics for any $\varepsilon$. This is equivalent to saying
that all joint probability densities satisfy time translation invariance, i.e.,


$$
\begin{aligned}
& p(x_1, t_1; x_2, t_2; x_3, t_3; \ldots; x_n, t_n) \\
&= p(x_1, t_1 + \varepsilon; x_2, t_2 + \varepsilon; x_3, t_3 + \varepsilon; \ldots; x_n, t_n + \varepsilon)
\end{aligned}
\tag{3.7.1}
$$


and hence such probabilities are only functions of the time differences, $t_i - t_j$. In
particular, the one-time probability is independent of time and can be simply
written as


$$ p_s(x) \tag{3.7.2} $$

and the two-time joint probability as

$$ p_s(x_1, t_1 - t_2; x_2, 0) . \tag{3.7.3} $$

Finally, the conditional probability can also be written as

$$ p_s(x_1, t_1 - t_2 \mid x_2, 0) . \tag{3.7.4} $$


For a Markov process, since all joint probabilities can be written as products of the
two-time conditional probability and the one-time probability, a necessary and
sufficient condition for stationarity is the ability to write the one and two-time
probabilities in the forms given in (3.7.1-3).


3.7.1 Ergodic Properties 


If we have a stationary process, it is reasonable to expect that average measurements
could be constructed by taking values of the variable $x$ at successive times, and
averaging various functions of these. This is effectively a belief that the law of
large numbers (as explained in Sect. 2.5.2) applies to the variables defined by
successive measurements in a stochastic process.

Let us define the variable $\bar{X}(T)$ by


$$ \bar{X}(T) = \frac{1}{2T} \int_{-T}^{T} dt\, x(t) , \tag{3.7.5} $$


where $x(t)$ is a stationary process, and consider the limit $T \to \infty$. This represents a
possible model of measurement of the mean by averaging over all times. Clearly


$$ \langle \bar{X}(T) \rangle = \langle x \rangle_s . \tag{3.7.6} $$


We now calculate the variance of $\bar{X}(T)$. Thus,


$$ \langle \bar{X}(T)^2 \rangle = \frac{1}{4T^2} \iint_{-T}^{T} dt_1\, dt_2\, \langle x(t_1) x(t_2) \rangle \tag{3.7.7} $$

and if the process is stationary,

$$ \langle x(t_1) x(t_2) \rangle \equiv R(t_1 - t_2) + \langle x \rangle_s^2 , \tag{3.7.8} $$


where $R$ is the two-time correlation function. Hence,

$$ \langle \bar{X}(T)^2 \rangle - \langle x \rangle_s^2 = \frac{1}{4T^2} \int_{-2T}^{2T} d\tau\, R(\tau)(2T - |\tau|) \tag{3.7.8} $$


where the last factor follows by changing variables to


$$ \tau = t_1 - t_2 , \qquad t = t_1 \tag{3.7.9} $$


and integrating $t$.

The left-hand side is now the variance of $\bar{X}(T)$ and one can show that under
certain conditions, this vanishes as $T \to \infty$. Most straightforwardly, all we require
is that


$$ \lim_{T \to \infty} \frac{1}{T} \int_{-2T}^{2T} d\tau \Big( 1 - \frac{|\tau|}{2T} \Big) R(\tau) = 0 \tag{3.7.10} $$
which is a little obscure. However, it is clear that a sufficient condition for this
limit to be zero is for


$$ \int_0^{\infty} d\tau\, |R(\tau)| < \infty , \tag{3.7.11} $$


in which case, we simply require that the correlation function $\langle x(t_1), x(t_2) \rangle$ should
tend to zero sufficiently rapidly as $|t_1 - t_2| \to \infty$. In cases of interest it is frequently
found that the asymptotic behaviour of $R(\tau)$ is


$$ R(\tau) \sim \mathrm{Re}\, \{ A \exp(-\tau / \tau_c) \} , \tag{3.7.12} $$


where $\tau_c$ is a (possibly complex) parameter known as the correlation time. Clearly
the criterion of (3.7.11) is satisfied, and we find in this case that the variance in $\bar{X}(T)$
approaches zero so that using (3.7.6) and (2.9.4), we may write


$$ \underset{T \to \infty}{\text{ms-lim}}\; \bar{X}(T) = \langle x \rangle_s . \tag{3.7.13} $$


This means that the averaging procedure (3.7.5) is indeed valid. It is not difficult to
extend the result to an average of an infinite set of measurements at discrete times
$t_n = t_0 + n\Delta t$.
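
To see how quickly the variance of the time average decays, the sketch below evaluates the integral $\frac{1}{4T^2} \int_{-2T}^{2T} d\tau\, R(\tau)(2T - |\tau|)$ for an assumed purely exponential correlation function $R(\tau) = A \exp(-|\tau|/\tau_c)$ with real $A$ and $\tau_c$. Both the midpoint quadrature and the closed form $\frac{A}{2T^2} [2T\tau_c - \tau_c^2 (1 - e^{-2T/\tau_c})]$ are worked out here for this special case only; the parameter values are illustrative.

```python
import math


def corr(tau, amp=1.0, tau_c=1.0):
    """Assumed correlation function R(tau) = amp * exp(-|tau| / tau_c)."""
    return amp * math.exp(-abs(tau) / tau_c)


def variance_of_time_average(T, n=20000, amp=1.0, tau_c=1.0):
    """Midpoint-rule value of (1/4T^2) * integral of R(tau) * (2T - |tau|)
    over -2T <= tau <= 2T."""
    h = 4.0 * T / n
    total = 0.0
    for k in range(n):
        tau = -2.0 * T + (k + 0.5) * h
        total += corr(tau, amp, tau_c) * (2.0 * T - abs(tau))
    return total * h / (4.0 * T * T)


def variance_closed_form(T, amp=1.0, tau_c=1.0):
    """The same integral done analytically for the exponential R(tau)."""
    decay = 1.0 - math.exp(-2.0 * T / tau_c)
    return amp * (2.0 * T * tau_c - tau_c ** 2 * decay) / (2.0 * T * T)


for T in (10.0, 100.0):
    print(T, variance_of_time_average(T), variance_closed_form(T))
```

For $T \gg \tau_c$ the variance behaves like $A\tau_c/T$, so the averaging time must be long compared with the correlation time for the time average to be a good estimate of the mean.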

Other ergodic hypotheses can easily be stated, and the two quantities that are of 
most interest are the autocorrelation function and the distribution function. 
As already mentioned in Sect. 1.4.2, the most natural way of measuring an
autocorrelation function is through the definition

G(\tau, T) = \frac{1}{T} \int_0^T dt \, x(t) x(t + \tau) (3.7.14)


and we can rather easily carry through similar reasoning to show that

\operatorname{ms-lim}_{T \to \infty} G(\tau, T) = \langle x(t) x(t + \tau) \rangle_s , (3.7.15)

provided the following condition is satisfied. Namely, define ρ(τ, λ) by


\langle x(t + \lambda + \tau) x(t + \lambda) x(t + \tau) x(t) \rangle_s = \rho(\tau, \lambda) + \langle x(t + \tau) x(t) \rangle_s^2 . (3.7.16)


Then we require


\lim_{T \to \infty} \frac{1}{T} \int_0^{2T} d\lambda \left( 1 - \frac{\lambda}{2T} \right) \rho(\tau, \lambda) = 0 . (3.7.17)


We can see that this means that for sufficiently large λ, the four-time average
(3.7.16) factorises into a product of two-time averages, and that the "error term"
ρ(τ, λ) must vanish sufficiently rapidly for λ → ∞. Exponential behaviour, such
as given in (3.7.12), is sufficient, and usually found.

We similarly find that the spectrum, given by the Fourier transform 


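S(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i\omega\tau} G(\tau) \, d\tau (3.7.18)

as in Sect. 1.4, is also given by the procedure

S(\omega) = \lim_{T \to \infty} \frac{1}{2\pi T} \left| \int_0^T dt \, e^{-i\omega t} x(t) \right|^2 . (3.7.19)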


Finally, the practical method of measuring the distribution function is to consider
an interval (x_1, x_2) and measure x(t) repeatedly to determine whether it is in
this range or not. This gives a measure of \int_{x_1}^{x_2} dx \, p_s(x). Essentially, we are then
measuring the time average value of the function χ(x) defined by


\chi(x) = 1 , \quad x_1 < x < x_2 (3.7.20)
        = 0 \quad \text{otherwise},


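and we adapt the method of proving the ergodicity of \langle x \rangle to find that the
distribution is ergodic provided


\lim_{T \to \infty} \frac{1}{T} \int_0^{2T} d\tau \left( 1 - \frac{\tau}{2T} \right) \int_{x_1}^{x_2} dx' \, p_s(x') \left\{ \int_{x_1}^{x_2} dx \, [ p(x, \tau | x', 0) - p_s(x) ] \right\} = 0 . (3.7.21)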


The most obvious sufficient condition here is that

\lim_{\tau \to \infty} p(x, \tau | x', 0) = p_s(x) (3.7.22)

and that this limit is approached sufficiently rapidly. In practice, an exponential
approach is frequently found and this is, as in the case of the mean, quite sufficiently 
rapid. 

This condition is, in fact, sufficient for ergodicity of the mean and autocorrela- 
tion function for a Markov process, since all means can be expressed in terms of 
conditional probabilities and the sufficiently rapid achievement of the limit (3.7.22) 
can be readily seen to be sufficient to guarantee both (3.7.17) and (3.7.10). We will 
call a Markov process simply "ergodic" if this rather strong condition is satisfied.


3.7.2 Homogeneous Processes 


If the condition (3.7.22) is satisfied for a stationary Markov process, then we clearly 
have a way of constructing from the stationary Markov process a nonstationary 
process whose limit as time becomes large is the stationary process. We simply 
define the process for 


t, t' > t_0 (3.7.23)

by

p(x, t) = p_s(x, t | x_0, t_0) \quad \text{and} (3.7.24)

p(x, t | x', t') = p_s(x, t | x', t') (3.7.25)


and all other joint probabilities are obtained from these in the usual manner for a
Markov process. Clearly, if (3.7.22) is satisfied, we find that as t → ∞ or as t_0 → -∞,


p(x, t) \to p_s(x)


and all other probabilities become stationary because the conditional probability 
is stationary. Such a process is known as a homogeneous process. 

The physical interpretation is rather obvious. We have a stochastic system 
whose variable x is by some external agency fixed to have a value x_0 at time t_0. It
then evolves back to a stationary system with the passage of time. This is how many 
stationary systems are created in practice. 

From the point of view of the differential Chapman-Kolmogorov equation, we 
will find that the stationary distribution function p,(x) is a solution of the stationary 
differential Chapman-Kolmogorov equation, which takes the form 


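0 = - \sum_i \frac{\partial}{\partial z_i} [ A_i(z) p(z, t | y, t') ] + \frac{1}{2} \sum_{i,j} \frac{\partial^2}{\partial z_i \partial z_j} [ B_{ij}(z) p(z, t | y, t') ]
    + \int dx \, [ W(z | x) p(x, t | y, t') - W(x | z) p(z, t | y, t') ] , (3.7.26)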


where we have used the fact that the process is homogeneous to note that A, B
and W, as defined in (3.4.1-3), are independent of t. This is an alternative definition
of a homogeneous process.


3.7.3 Approach to a Stationary Process


A converse problem also exists. Suppose A, B and W are independent of time and
p_s(z) satisfies (3.7.26). Under what conditions does a solution of the differential
Chapman-Kolmogorov equation approach the stationary solution p_s(z)?

There does not appear to be a complete answer to this problem. However, we 
can give a reasonably good picture as follows. We define a Lyapunov functional 
K of any two solutions p_1 and p_2 of the differential Chapman-Kolmogorov equation
by


K = \int dx \, p_1(x, t) \log [ p_1(x, t) / p_2(x, t) ] (3.7.27)


and assume for the moment that neither p_1 nor p_2 is zero anywhere. We will now
show that K is never negative and that dK/dt is never positive.
Firstly, noting that both p_1(x, t) and p_2(x, t) are normalised to one, we write


K[p_1, p_2, t] = \int dx \, p_1(x, t) \{ \log [ p_1(x, t) / p_2(x, t) ]
    + p_2(x, t) / p_1(x, t) - 1 \} (3.7.28)

and use the inequality valid for all z > 0,

- \log z + z - 1 \ge 0 , (3.7.29)


to show that K ≥ 0.
Let us now show that dK/dt ≤ 0. We can write (using an abbreviated notation)


\frac{dK}{dt} = \int dx \left\{ \frac{\partial p_1}{\partial t} [ \log p_1 + 1 - \log p_2 ] - \frac{\partial p_2}{\partial t} [ p_1 / p_2 ] \right\} . (3.7.30)


We now calculate one by one the contribution to dK/dt from drift, diffusion, and 
jump terms in the differential Chapman-Kolmogorov equation: 


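\left( \frac{dK}{dt} \right)_{\text{drift}} = \sum_i \int dx \left\{ - [ \log (p_1 / p_2) + 1 ] \frac{\partial}{\partial x_i} (A_i p_1)
    + (p_1 / p_2) \frac{\partial}{\partial x_i} (A_i p_2) \right\} (3.7.31)


which can be rearranged to give

\left( \frac{dK}{dt} \right)_{\text{drift}} = \sum_i \int dx \, \frac{\partial}{\partial x_i} [ - A_i p_1 \log (p_1 / p_2) ] . (3.7.32)

Similarly, we may calculate

\left( \frac{dK}{dt} \right)_{\text{diff}} = \frac{1}{2} \sum_{i,j} \int dx \left\{ [ \log (p_1 / p_2) + 1 ] \frac{\partial^2}{\partial x_i \partial x_j} (B_{ij} p_1)
    - (p_1 / p_2) \frac{\partial^2}{\partial x_i \partial x_j} (B_{ij} p_2) \right\} (3.7.33)

and after some rearranging we may write

\left( \frac{dK}{dt} \right)_{\text{diff}} = - \frac{1}{2} \sum_{i,j} \int dx \, p_1 B_{ij} \left\{ \frac{\partial}{\partial x_i} [ \log (p_1 / p_2) ] \right\} \left\{ \frac{\partial}{\partial x_j} [ \log (p_1 / p_2) ] \right\}
    + \frac{1}{2} \sum_{i,j} \int dx \, \frac{\partial^2}{\partial x_i \partial x_j} [ p_1 B_{ij} \log (p_1 / p_2) ] . (3.7.34)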


Finally, we may calculate the jump contribution similarly: 


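\left( \frac{dK}{dt} \right)_{\text{jump}} = \int dx \, dx' \Big\{ [ W(x | x') p_1(x', t) - W(x' | x) p_1(x, t) ]
    \times \{ \log [ p_1(x, t) / p_2(x, t) ] + 1 \}
    - [ W(x | x') p_2(x', t) - W(x' | x) p_2(x, t) ] \, p_1(x, t) / p_2(x, t) \Big\} (3.7.35)


and after some rearrangement,


\left( \frac{dK}{dt} \right)_{\text{jump}} = \int dx \, dx' \, W(x | x') \{ p_2(x', t) [ \phi' \log (\phi / \phi') - \phi + \phi' ] \} , (3.7.36)

where

\phi = p_1(x, t) / p_2(x, t) (3.7.37)


and φ' is similarly defined in terms of x'.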

We now consider the simplest case. Suppose a stationary solution p_s(x) exists
which is nonzero everywhere, except at infinity, where it and its first derivative
vanish. Then we may choose p_2(x, t) = p_s(x). The contributions to dK/dt from
(3.7.32) and the second term in (3.7.34) can be integrated to give surface terms
which vanish at infinity so that we find


\left( \frac{dK}{dt} \right)_{\text{drift}} = 0 (3.7.38a)

\left( \frac{dK}{dt} \right)_{\text{diff}} \le 0 (3.7.38b)

\left( \frac{dK}{dt} \right)_{\text{jump}} \le 0 , (3.7.38c)


where the last inequality comes by setting z = φ/φ' in (3.7.29).

We must now consider under what situations the equalities in (3.7.38) are
actually achieved. Inspection of (3.7.36) shows that this term will be zero if and only
if φ = φ' for almost all x and x' which are such that W(x|x') ≠ 0. Thus, if
W(x|x') is never zero, i.e., if transitions can take place in both directions between
any pair of states, the vanishing of the jump contribution implies that φ(x) is
independent of x, i.e.,

p_1(x, t) / p_s(x) = \text{constant} . (3.7.39)


The constant must equal one since p_1(x, t) and p_s(x) are both normalised.

The term arising from diffusion will be strictly negative if B_ij is almost everywhere
positive definite. Hence, we have now shown that under rather strong conditions,
namely,


p_s(x) \ne 0 \quad \text{with probability 1}
W(x | x') \ne 0 \quad \text{with probability 1}, (3.7.40)
B_{ij}(x) \quad \text{positive definite with probability 1},


any solution of the differential Chapman-Kolmogorov equation approaches
the stationary solution p_s(x) as t → ∞.
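The monotonic decay of K can be checked numerically in the simplest setting, a pure jump process on a finite state space. The sketch below is only an illustration under stated assumptions: the three-state rate matrix W is hypothetical, the integral in (3.7.27) is replaced by a sum over states, and the Master equation is integrated with a plain explicit Euler step.

```python
import math

# Hypothetical rates: W[i][j] plays the role of W(i|j), the rate of jumping j -> i.
# Every off-diagonal entry is nonzero, so all states are mutually connected.
W = [[0.0, 1.0, 0.5],
     [2.0, 0.0, 1.5],
     [0.3, 0.7, 0.0]]

def step(p, dt):
    """One explicit Euler step of dp_i/dt = sum_j [W(i|j) p_j - W(j|i) p_i]."""
    n = len(p)
    return [p[i] + dt * sum(W[i][j] * p[j] - W[j][i] * p[i] for j in range(n))
            for i in range(n)]

def lyapunov_k(p1, p2):
    """Discrete analogue of (3.7.27): K = sum_i p1_i log(p1_i / p2_i)."""
    return sum(a * math.log(a / b) for a, b in zip(p1, p2))

# Approximate the stationary solution by relaxing an arbitrary start for a long time.
ps = [1.0 / 3.0] * 3
for _ in range(20000):
    ps = step(ps, 1e-3)

# Follow K for a solution started far from the stationary one.
p1 = [0.9, 0.05, 0.05]
history = []
for _ in range(3000):
    history.append(lyapunov_k(p1, ps))
    p1 = step(p1, 1e-3)

print(history[0], history[-1])
```

With these rates K starts near 1 and falls steadily towards zero; setting some rates to zero so that the states split into two disconnected groups is a simple way to see the result fail.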
The result fails in two basic kinds of systems. 


a) Disconnected State Space 

The result is best illustrated when A_i and B_ij vanish, so we have a pure jump system.
Suppose the space divides into two regions R_1 and R_2 such that transitions from R_1
to R_2 and back are impossible; hence, W(x|x') = 0 if x and x' are not both in
R_1 or both in R_2. Then it is possible to have dK/dt = 0 if


p_1(x, t) = \lambda_1 p_s(x) , \quad x \in R_1 (3.7.41)
          = \lambda_2 p_s(x) , \quad x \in R_2


so that there is no unique stationary distribution. The two regions are disconnected
and separate stochastic processes take place in each, and in each of these, there is a
unique stationary solution. The relative probability of being in R_1 or R_2 is not changed
by the process.

A similar result holds, in general, if in addition B_ij and A_i vanish
on the boundary between R_1 and R_2.


b) p_s(x) Vanishes in Some Definite Region
If we have


p_s(x) = 0 , \quad x \in R_1 (3.7.42)
       \ne 0 , \quad x \in R_2


and again A_i and B_ij vanish, then it follows that, since p_s(x) satisfies the stationary
equation (3.7.26),


W(x | y) = 0 , \quad x \in R_1 , \; y \in R_2 . (3.7.43)


In other words, no transitions are possible from the region R_2 where the stationary
distribution is positive to R_1, where the stationary distribution vanishes.


3.7.4 Autocorrelation Function for Markov Processes


For any Markov process, we can write a very elegant formula for the autocorrelation
function. We define


\langle X(t) | [x_0, t_0] \rangle = \int dx \, x \, p(x, t | x_0, t_0) , (3.7.44)

then the autocorrelation matrix

\langle X(t) X(t_0)^T \rangle = \int dx \, dx_0 \, x x_0^T p(x, t; x_0, t_0) (3.7.45)
    = \int dx_0 \, \langle X(t) | [x_0, t_0] \rangle x_0^T p(x_0, t_0) . (3.7.46)


Thus we see that (3.7.44) defines the mean of X(t) under the condition that X had
the value x_0 at time t_0, and (3.7.46) tells us that the autocorrelation matrix is obtained
by averaging this conditional average (multiplied by x_0^T) at time t_0. These
results are true by definition for any stochastic process.

In a Markov process we have, however, a unique conditional probability which 
determines the whole process. Thus, for a Markov process, we can state that 
\langle X(t) | [x_0, t_0] \rangle is a uniquely defined quantity, since the knowledge of x_0 at time t_0
completely determines the future of the process. The most notable use of this 
property is in the computation of the stationary autocorrelation function. To 
illustrate how this uniqueness is important, let us consider a non-Markov stationary 
process with joint probabilities 


p_s(x_1, t_1; x_2, t_2; \ldots ; x_n, t_n) , (3.7.47)


which, of course, depend only on time differences. Let us now create a correspond- 
ing nonstationary process by selecting only sample paths which pass through the 
point x = a at time t = 0. Thus, we define 


p_a(x_1, t_1; x_2, t_2; \ldots ; x_n, t_n) = p_s(x_1, t_1; x_2, t_2; \ldots ; x_n, t_n | a, 0) . (3.7.48)

Then for this process we note that

\langle X(t) | [x_0, t_0] \rangle_a = \int dx \, x \, p_s(x, t | x_0, t_0; a, 0) (3.7.49)


which contains a dependence on a symbolised by the subscript a on the average
bracket. If the original stationary process possesses appropriate ergodic properties,
then


\lim_{\tau \to \infty} p_s(x, t + \tau | x_0, t_0 + \tau; a, 0) = p_s(x, t - t_0 | x_0, 0) (3.7.50)


so that we will also have a stationary conditional average of x


\langle X(t) | [x_0, t_0] \rangle_s = \lim_{\tau \to \infty} \langle X(t + \tau) | [x_0, t_0 + \tau] \rangle_a (3.7.51)


and the stationary autocorrelation matrix is given by

\langle X(t) X(t_0)^T \rangle_s = \int dx_0 \, x_0^T \langle X(t) | [x_0, t_0] \rangle_s \, p_s(x_0) (3.7.52)
    = \lim_{\tau \to \infty} \langle X(t + \tau) X(t_0 + \tau)^T \rangle_a
    = \lim_{\tau \to \infty} \int dx_0 \, x_0^T \langle X(t + \tau) | [x_0, t_0 + \tau] \rangle_a \, p_a(x_0, t_0 + \tau) . (3.7.53)


However, when the process is Markovian, this cumbersome limiting procedure is 
not necessary since 


\text{Markov} \Longrightarrow \langle X(t) | [x_0, t_0] \rangle_s = \langle X(t) | [x_0, t_0] \rangle_a
    = \langle X(t) | [x_0, t_0] \rangle . (3.7.54)


Equation (3.7.46) is a regression theorem when applied to a Markov process and is
the basis of a more powerful regression theorem for linear systems. By this we mean
systems such that a linear equation of motion exists for the means, i.e., 


d \langle X(t) | [x_0, t_0] \rangle / dt = - A \langle X(t) | [x_0, t_0] \rangle , (3.7.55)


which is very often the case in systems of practical interest, either as an exact result
or as an approximation. The initial conditions for (3.7.55) are clearly


\langle X(t_0) | [x_0, t_0] \rangle = x_0 . (3.7.56)

Then from (3.7.46, 55)


\frac{d}{dt} \langle X(t) X(t_0)^T \rangle = - A \langle X(t) X(t_0)^T \rangle (3.7.57)


with initial conditions \langle X(t_0) X(t_0)^T \rangle. The time correlation matrix


\langle X(t) X(t_0)^T \rangle - \langle X(t) \rangle \langle X(t_0)^T \rangle \equiv \langle X(t), X(t_0)^T \rangle (3.7.58)


obviously obeys the same equation, with the initial condition given by the covariance
matrix at time t_0. In a stationary system, we have the result that if G(t) is the
stationary time correlation function and σ the stationary covariance matrix, then


dG(t)/dt = - A G(t) (3.7.59)
and

G(0) = \sigma (3.7.60)
or

G(t) = \exp [-At] \, \sigma (3.7.61)


which is the regression theorem in its simplest form. We again stress that it is valid
for the Markov processes in which the mean values obey linear evolution equations
like (3.7.55).

For non-Markov processes there is no simple procedure. We must carry out the
complicated procedure implicit in (3.7.53).


3.8 Examples of Markov Processes 


We present here for reference some fundamental solutions of certain cases of the 
differential Chapman-Kolmogorov equation. These will have a wide application 
throughout the remainder of this book. 


3.8.1 The Wiener Process 
This takes its name from N. Wiener who studied it extensively. From the point of
view of this chapter, it is the solution of the Fokker-Planck equation as discussed
in Sect. 3.5.2, in which there is only one variable W(t), the drift coefficient is zero and
the diffusion coefficient is 1. Thus, the Fokker-Planck equation for this case is


\frac{\partial}{\partial t} p(w, t | w_0, t_0) = \frac{1}{2} \frac{\partial^2}{\partial w^2} p(w, t | w_0, t_0) . (3.8.1)


Utilising the initial condition 
p(w, t_0 | w_0, t_0) = \delta(w - w_0) (3.8.2)

on the conditional probability, we solve (3.8.1) by use of the characteristic function

\phi(s, t) = \int dw \, p(w, t | w_0, t_0) \exp(isw) (3.8.3)


which satisfies


\frac{\partial \phi}{\partial t} = - \frac{1}{2} s^2 \phi (3.8.4)

so that

\phi(s, t) = \exp \left[ - \frac{1}{2} s^2 (t - t_0) \right] \phi(s, t_0) . (3.8.5)


From (3.8.2), the initial condition is

\phi(s, t_0) = \exp(i s w_0)


so that


\phi(s, t) = \exp \left[ i s w_0 - \frac{1}{2} s^2 (t - t_0) \right] . (3.8.6)

Performing the Fourier inversion, we have the solution to (3.8.1):

p(w, t | w_0, t_0) = [2\pi (t - t_0)]^{-1/2} \exp [ - (w - w_0)^2 / 2(t - t_0) ] . (3.8.7)

This represents a Gaussian, with

\langle W(t) \rangle = w_0 (3.8.8)

\langle [W(t) - w_0]^2 \rangle = t - t_0 , (3.8.9)


so that an initially sharp distribution spreads in time, as graphed in Fig.3.4. 


Fig. 3.4. Wiener process: spreading of an initially sharp distribution p(w, t|w_0, t_0) with increasing time t - t_0
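The spreading described by (3.8.8, 9) can be reproduced with a small Monte Carlo sketch. It assumes that a path may be built by summing independent Gaussian increments of variance equal to the time step, which is the property established for the Wiener process; the function name, seed, and sample sizes are arbitrary illustrative choices.

```python
import math
import random

def wiener_endpoints(w0, duration, n_steps, n_paths, seed=0):
    """Sample W(t0 + duration) for paths started at w0, built from
    independent Gaussian increments with variance dt per step."""
    rng = random.Random(seed)
    dt = duration / n_steps
    sd = math.sqrt(dt)
    ends = []
    for _ in range(n_paths):
        w = w0
        for _ in range(n_steps):
            w += rng.gauss(0.0, sd)
        ends.append(w)
    return ends

ends = wiener_endpoints(w0=1.0, duration=2.0, n_steps=50, n_paths=4000)
mean = sum(ends) / len(ends)
var = sum((w - 1.0) ** 2 for w in ends) / len(ends)
print(mean, var)  # should lie close to w0 = 1 and t - t0 = 2, up to sampling error
```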


A multivariate Wiener process can be defined as 

W(t) = [W_1(t), W_2(t), \ldots, W_n(t)] (3.8.10)

which satisfies the multivariable Fokker-Planck equation

\frac{\partial}{\partial t} p(w, t | w_0, t_0) = \frac{1}{2} \sum_i \frac{\partial^2}{\partial w_i^2} p(w, t | w_0, t_0) (3.8.11)


whose solution is


p(w, t | w_0, t_0) = [2\pi (t - t_0)]^{-n/2} \exp [ - (w - w_0)^2 / 2(t - t_0) ] , (3.8.12)

a multivariate Gaussian with

\langle W(t) \rangle = w_0 (3.8.13)

and


\langle [W_i(t) - w_{0i}] [W_j(t) - w_{0j}] \rangle = (t - t_0) \delta_{ij} . (3.8.14)

The one-variable Wiener process is often simply called Brownian motion, since the
Wiener process equation (3.8.1) is exactly the same as the differential equation of
diffusion, shown by Einstein to be obeyed by Brownian motion, as we noted in
Sect. 1.2. The terminology is, however, not universal.

Points of note concerning the Wiener process are: 


a) Irregularity of Sample Paths 

Although the mean value of W(t) is zero, the mean square becomes infinite as t → ∞.
This means that the sample paths of W(t) are very variable, indeed surprisingly
so. In Fig. 3.5, we have given a few different sample paths with the same initial point 
to illustrate the extreme non-reproducibility of the paths. 


b) Non-differentiability of Sample Paths 
The Wiener process is a diffusion process and hence the sample paths of W(t) are 
continuous. However, they are not differentiable. Consider 


\mathrm{Prob} \{ | [W(t + h) - W(t)] / h | > k \} . (3.8.15)


From the solution for the conditional probability, this probability is

2 \int_{kh}^{\infty} dw \, (2\pi h)^{-1/2} \exp(-w^2 / 2h) (3.8.16)


and in the limit as h → 0 this is one. This means that no matter what value of k
we choose, |[W(t + h) - W(t)]/h| is almost certain to be greater than this, i.e., the
derivative at any point is almost certainly infinite. This is in agreement with the
similar intuitive picture presented in Sect. 3.5.2 and the simulated paths given in
Fig. 3.5 illustrate this point dramatically. This corresponds, of course, to the well-known
experimental fact that the Brownian particles have an exceedingly irregular
motion. However, this is clearly an idealisation, since if W(t) represents the position
of the Brownian particle, this means that its speed is almost certainly infinite. The
Ornstein-Uhlenbeck process is a more realistic model of Brownian motion (Sect.
3.8.4).

Fig. 3.5. Three simulated sample paths of the Wiener process, illustrating their great variability
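The limit of (3.8.16) can be made concrete. Substituting u = w / sqrt(2h) turns the integral into the complementary error function erfc(k sqrt(h/2)), so the probability can be tabulated directly; the helper name and the sample values of k and h below are illustrative only.

```python
import math

def prob_slope_exceeds(k, h):
    """Probability (3.8.16) that |W(t+h) - W(t)| / h exceeds k,
    rewritten as erfc(k * sqrt(h / 2)) via the substitution u = w / sqrt(2h)."""
    return math.erfc(k * math.sqrt(h / 2.0))

# Even for a large bound k, the probability tends to one as h shrinks.
probs = [prob_slope_exceeds(1000.0, h) for h in (1e-2, 1e-6, 1e-10)]
for h, p in zip((1e-2, 1e-6, 1e-10), probs):
    print(h, p)
```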


c) Independence of Increments

The Wiener process is fundamental to the study of diffusion processes, and by 
means of stochastic differential equations, we can express any diffusion process in 
terms of the Wiener process. 

Of particular importance is the statistical independence of the increments of 
W(t). More precisely, since the Wiener process is a Markov process, the joint proba- 
bility density can be written 

p(w_n, t_n; w_{n-1}, t_{n-1}; w_{n-2}, t_{n-2}; \ldots ; w_0, t_0)
    = \prod_{i=0}^{n-1} p(w_{i+1}, t_{i+1} | w_i, t_i) \, p(w_0, t_0) , (3.8.17)

and using the explicit form of the conditional probabilities (3.8.7), we see that


p(w_n, t_n; w_{n-1}, t_{n-1}; w_{n-2}, t_{n-2}; \ldots ; w_0, t_0)
    = \prod_{i=0}^{n-1} \{ [2\pi (t_{i+1} - t_i)]^{-1/2} \exp [ - (w_{i+1} - w_i)^2 / 2(t_{i+1} - t_i) ] \} \, p(w_0, t_0) . (3.8.18)

If we define the variables


\Delta W_i \equiv W(t_i) - W(t_{i-1}) (3.8.19)
\Delta t_i \equiv t_i - t_{i-1} , (3.8.20)

then the joint probability density for the ΔW_i is


p(\Delta w_n; \Delta w_{n-1}; \Delta w_{n-2}; \ldots ; \Delta w_1; w_0)
    = \prod_{i=1}^{n} \{ (2\pi \Delta t_i)^{-1/2} \exp ( - \Delta w_i^2 / 2 \Delta t_i ) \} \, p(w_0, t_0) (3.8.21)


which shows from the definition of statistical independence given in Sect.2.3.4, 
that the variables ΔW_i are independent of each other and of W(t_0).

The aspect of having independent increments ΔW_i is very important in the
definition of stochastic integration which is carried out in Sect. 4.2. 


d) Autocorrelation Functions 
A quantity of great interest is the autocorrelation function, already discussed in 
Sects. 1.4.2 and 3.7.4. The formal definition is 


\langle W(t) W(s) | [w_0, t_0] \rangle = \int dw_1 \, dw_2 \, w_1 w_2 \, p(w_1, t; w_2, s | w_0, t_0) , (3.8.22)


which is the mean product of W(t) and W(s) on the condition that the initial value is
W(t_0) = w_0, and we can see, assuming t > s, that

\langle W(t) W(s) | [w_0, t_0] \rangle = \langle [W(t) - W(s)] W(s) \rangle + \langle [W(s)]^2 \rangle . (3.8.23)


Using the independence of increments, the first average is zero and the second is
given by (3.8.9) so that we have, in general,


\langle W(t) W(s) | [w_0, t_0] \rangle = \min(t - t_0, s - t_0) + w_0^2 , (3.8.24)


which is correct for t > s and t < s.


3.8.2 The Random Walk in One Dimension 


This is a very famous problem, which is now considered classical. A man moves
along a line, taking, at random, steps to the left or the right with equal probability.
The steps are of length l so that his position can take on only the value nl, where n
is integral. We want to know the probability that he reaches a given point a
distance nl from the origin after a given elapsed time.

The problem can be defined in two ways. The first, which is more traditional, is 
to allow the walker to take steps at times Nτ (N integral) at which times he must
step either left or right, with equal probability. The second is to allow the walker to
take steps left or right with a probability per unit time d which means that the
walker waits at each point for a variable time. The second method is describable
by a Master equation.

To do a Master equation treatment of the problem, we consider that the transi- 
tion probability per unit time is given by the form 


W(n + 1 | n, t) = W(n - 1 | n, t) = d ; (3.8.25)


otherwise, W(n|m, t) = 0 so that, according to Sect. 3.5.1, the Master equation
for the man to be at the position nl, given that he started at n'l, is

\partial_t P(n, t | n', t') = d [ P(n + 1, t | n', t') + P(n - 1, t | n', t')
    - 2 P(n, t | n', t') ] . (3.8.26)

The more classical form of the random walk does not assume that the man makes
his jump to the left or right according to a Master equation, but that he jumps left
or right with equal probability at times Nτ, so that time is a discrete variable. In
this case, we can write

P(n, (N + 1)\tau | n', N'\tau) = \frac{1}{2} [ P(n + 1, N\tau | n', N'\tau)
    + P(n - 1, N\tau | n', N'\tau) ] . (3.8.27)


If τ is small, we can view (3.8.26, 27) as approximations to each other by writing


P(n, (N + 1)\tau | n', N'\tau) \simeq P(n, N\tau | n', N'\tau) + \tau \, \partial_t P(n, t | n', t') (3.8.28)

with t = Nτ, t' = N'τ and d = (1/2)τ^{-1}, so that the transition probability per unit
time in the Master equation model corresponds to half of the inverse waiting time
τ in the discrete time model.
Both systems can be easily solved by introducing the characteristic function 


G(s, t) = \langle e^{ins} \rangle = \sum_n P(n, t | n', t') e^{ins} (3.8.29)

in which case the Master equation gives

\partial_t G(s, t) = d ( e^{is} + e^{-is} - 2 ) G(s, t) (3.8.30)

and the discrete time equation becomes

G(s, (N + 1)\tau) = \frac{1}{2} ( e^{is} + e^{-is} ) G(s, N\tau) . (3.8.31)

Assuming the man starts at the origin n' = 0 at time t' = 0, we find

G(s, 0) = 1 (3.8.32)

in both cases, so that the solution to (3.8.30) is

G_1(s, t) = \exp [ ( e^{is} + e^{-is} - 2 ) td ] , (3.8.33)

and to (3.8.31)

G_2(s, N\tau) = [ \frac{1}{2} ( e^{is} + e^{-is} ) ]^N (3.8.34)


which can be written

G_2(s, t) = \left[ 1 + \frac{td}{N} ( e^{is} + e^{-is} - 2 ) \right]^N . (3.8.35)

Using the usual exponential limit


\lim_{N \to \infty} \left( 1 + \frac{a}{N} \right)^N = e^a , (3.8.36)


we see that, provided s is sufficiently small,


\lim_{N \to \infty} G_2(s, t) = G_1(s, t) (3.8.37)


which, by the properties of the characteristic function (v) in Sect. 2.6, means the
probability distributions approach each other.

The appropriate probability distributions can be obtained by expanding G_2(s, Nτ)
and G_1(s, t) in powers of exp (is); we find


P_1(n, t | 0, 0) = e^{-2td} I_n(2td) (3.8.38)

P_2(n, N\tau | 0, 0) = \left( \frac{1}{2} \right)^N N! \left[ \left( \frac{N - n}{2} \right)! \left( \frac{N + n}{2} \right)! \right]^{-1} . (3.8.39)

The discrete time distribution is also known as the Bernoulli distribution; it gives
the probability of obtaining n more heads than tails in tossing an unbiased coin N times.
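Both distributions can be checked numerically. The sketch below assumes that I_n denotes the modified Bessel function of the first kind, whose generating function exp[(z/2)(u + 1/u)] = Σ_n I_n(z) u^n matches (3.8.33) term by term when z = 2td and u = exp(is). The helper names and the truncation of the power series are illustrative choices.

```python
import math

def bessel_i(n, z, terms=40):
    """Modified Bessel function I_n(z) for integer n, from its power series."""
    n = abs(n)
    return sum((z / 2.0) ** (2 * m + n) / (math.factorial(m) * math.factorial(m + n))
               for m in range(terms))

def p_continuous(n, td):
    """Continuous-time walk: exp(-2 td) I_n(2 td), where td is time times the rate d."""
    return math.exp(-2.0 * td) * bessel_i(n, 2.0 * td)

def p_discrete(n, N):
    """Discrete-time walk (3.8.39); zero unless N and n share parity and |n| <= N."""
    if abs(n) > N or (N - n) % 2:
        return 0.0
    return 0.5 ** N * math.comb(N, (N + n) // 2)

td = 1.5
ns = range(-40, 41)
norm = sum(p_continuous(n, td) for n in ns)
second_moment = sum(n * n * p_continuous(n, td) for n in ns)
print(norm, second_moment)  # normalisation 1 and mean square 2*td
```

The mean square displacement 2td of the continuous-time walk is what one expects from the difference of two independent Poisson counts (right and left steps), each with mean td; the discrete-time walk gives mean square N after N steps.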

The limit of continuous space is also of interest. If we set the distance travelled 
as 


x = nl (3.8.40)

so that the characteristic function of the distribution of x is

\phi_1(s, t) = \langle e^{isx} \rangle = G_1(ls, t) = \exp [ ( e^{ils} + e^{-ils} - 2 ) td ] , (3.8.41)

then the limit of infinitesimally small steps l → 0 is

\phi_1(s, t) \to \exp(-s^2 t D) , (3.8.42)


where D = \lim_{l \to 0} (l^2 d) . (3.8.43)


This is the characteristic function of a Gaussian (Sect. 2.8.1) of the form

p(x, t | 0, 0) = (4\pi D t)^{-1/2} \exp(-x^2 / 4Dt) (3.8.44)


and is of course the distribution for the Wiener process (Sect. 3.8.1) or Brownian
motion, as mentioned in Sect. 1.2. Thus, the Wiener process can be regarded as the
limit of a continuous time random walk in the limit of infinitesimally small step size.
The limit


l \to 0 , \; \tau \to 0 , \quad \text{with} \quad D = \lim (l^2 / \tau) (3.8.45)


of the discrete time random walk gives the same result. From this form, we see
clearly the expression of D as the mean square distance travelled per unit time.

We can also see more directly that expanding the right-hand side of (3.8.26)
as a function of x up to second order in l gives


\partial_t p(x, t | 0, 0) = (l^2 d) \, \partial_x^2 p(x, t | 0, 0) . (3.8.46)


The three processes are thus intimately connected with each other at two levels, 
namely, under the limits considered, the stochastic equations approach each other 
and under those same limits, the solutions to these equations approach each 
other. These limits are exactly those used by Einstein. Comparison with Sect.1.2 
shows that he modelled Brownian motion by a discrete time and space random 
walk, but nevertheless, derived the Wiener process model by expanding the equations
for time development of the distribution function.

The limit results of this section are a slightly more rigorous version of Einstein’s
method. There are generalisations of these results to less specialised situations and it 
is a fair statement to make that almost any jump process has some kind of limit 
which is a diffusion process. However, the precise limits are not always so simple, 
and there are limits in which certain jump processes become deterministic and 
are governed by Liouville’s equation (Sect.3.5.3) rather than the full Fokker-Planck 
equation. These results are presented in Sect.7.2. 


3.8.3 Poisson Process 


We have already noted the Poisson process in Sect.1.4.1. The process in which 
electrons arrive at an anode or customers arrive at a shop with a probability per 
unit time d of arriving, is governed by the Master equation for which 


W(n + 1 | n, t) = d ; (3.8.47)

otherwise,

W(n | m, t) = 0 . (3.8.48)


This Master equation becomes

\partial_t P(n, t | n', t') = d [ P(n - 1, t | n', t') - P(n, t | n', t') ] (3.8.49)


and by comparison with (3.8.26) also represents a “one-sided” random walk, in 
which the walker steps to the right only with probability per unit time equal to d. 
The characteristic function equation is similar to (3.8.30): 


\partial_t G(s, t) = d [ \exp(is) - 1 ] G(s, t) (3.8.50)

with the solution

G(s, t) = \exp \{ td [ \exp(is) - 1 ] \} (3.8.51)


for the initial condition that there are initially no customers (or electrons) at time
t = 0, yielding


P(n, t | 0, 0) = \exp(-td) (td)^n / n! , (3.8.52)

a Poisson distribution with mean given by

\langle N(t) \rangle = td . (3.8.53)
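A direct numerical check that the Poisson distribution (3.8.52) satisfies the Master equation (3.8.49) can be sketched as follows. The time derivative is approximated by a central difference, and the rate d, the time t, and the step eps are arbitrary illustrative values.

```python
import math

def poisson_p(n, t, d):
    """P(n, t | 0, 0) from (3.8.52); zero for negative n."""
    if n < 0:
        return 0.0
    return math.exp(-t * d) * (t * d) ** n / math.factorial(n)

def master_residual(n, t, d, eps=1e-5):
    """Left side minus right side of (3.8.49), using a central difference in t."""
    lhs = (poisson_p(n, t + eps, d) - poisson_p(n, t - eps, d)) / (2.0 * eps)
    rhs = d * (poisson_p(n - 1, t, d) - poisson_p(n, t, d))
    return lhs - rhs

d, t = 2.0, 1.5
residuals = [master_residual(n, t, d) for n in range(10)]
mean = sum(n * poisson_p(n, t, d) for n in range(80))
print(max(abs(r) for r in residuals), mean)  # residuals near zero, mean equal to t*d
```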
In contrast to the random walk, the only limit that exists is l → 0, with


dl \equiv v (3.8.54)

held fixed and the limiting characteristic function is

\lim_{l \to 0} \{ \exp [ td ( e^{ils} - 1 ) ] \} = \exp(itvs) (3.8.55)


with the solution

p(x, t | 0, 0) = \delta(x - vt) . (3.8.56)


We also see that in this limit, we would obtain from the Master equation (3.8.49) 
Liouville’s equation, whose solution would be the deterministic motion we have 
derived. 

We can do a slightly more refined analysis. We expand the characteristic func- 
tion up to second order in s in the exponent and find 


\phi(s, t) = G(ls, t) = \exp [ t ( ivs - s^2 D / 2 ) ] , (3.8.57)

where, as in the previous section,

D = l^2 d .


This is the characteristic function of a Gaussian with variance Dt and mean vt,
so that we now have


p(x, t | 0, 0) = (2\pi D t)^{-1/2} \exp [ - (x - vt)^2 / 2Dt ] . (3.8.58)

It is also clear that this solution is the solution of

\partial_t p(x, t | 0, 0) = - v \, \partial_x p(x, t | 0, 0) + \frac{1}{2} D \, \partial_x^2 p(x, t | 0, 0) , (3.8.59)

which is obtained by expanding the Master equation (3.8.49) to order l^2, by writing


d \, P(n - 1, t | 0, 0) = d \, p(x - l, t | 0, 0)
    \simeq d \, p(x, t | 0, 0) - l d \, \partial_x p(x, t | 0, 0) + \frac{1}{2} l^2 d \, \partial_x^2 p(x, t | 0, 0) . (3.8.60)


However, this is an approximation or an expansion and not a limit. The limit l → 0
gives Liouville's equation with the purely deterministic solution (3.8.56). Effectively,
the limit l → 0 with well-defined v corresponds to D = 0. The kind of approximation
just mentioned is a special case of van Kampen's system size expansion
which we treat fully in Sect. 7.2.3.


3.8.4 The Ornstein-Uhlenbeck Process 


All the examples so far have had no stationary distribution, that is, as t → ∞, the
distribution at any finite point approaches zero and we see that, with probability 
one, the point moves to infinity. 

If we add a linear drift term to the Wiener process, we have a Fokker-Planck 
equation of the form 


\partial_t p = \partial_x (kxp) + \frac{1}{2} D \, \partial_x^2 p , (3.8.61)

where by p we mean p(x, t | x_0, 0). This is the Ornstein-Uhlenbeck process [3.5]. The
equation for the characteristic function


\phi(s) = \int_{-\infty}^{\infty} e^{isx} p(x, t | x_0, 0) \, dx \quad \text{is} (3.8.62)

\partial_t \phi + ks \, \partial_s \phi = - \frac{1}{2} D s^2 \phi . (3.8.63)

The method of characteristics can be used to solve this equation, namely, if


u(s, t, \phi) = a \quad \text{and} \quad v(s, t, \phi) = b (3.8.64)


are two integrals of the subsidiary equation (with a and b arbitrary constants)


\frac{dt}{1} = \frac{ds}{ks} = - \frac{d\phi}{\frac{1}{2} D s^2 \phi} , (3.8.65)


then a general solution of (3.8.63) is given by


f(u, v) = 0 .


The particular integrals are readily found by integrating the equation involving dt
and ds and that involving ds and dφ; they are


u(s, t, \phi) = s \exp(-kt) \quad \text{and} (3.8.66)


v(s, t, \phi) = \phi \exp(D s^2 / 4k) , (3.8.67)


and the general solution can clearly be put in the form v = g(u) with g(u) an arbitrary
function of u. Thus, the general solution is


\phi(s, t) = \exp(-D s^2 / 4k) \, g[s \exp(-kt)] . (3.8.68)

The boundary condition

p(x, 0 | x_0, 0) = \delta(x - x_0) (3.8.69)

clearly requires

\phi(s, 0) = \exp(i x_0 s) (3.8.70)

and gives

g(s) = \exp(D s^2 / 4k + i x_0 s) ,


and hence,


\phi(s, t) = \exp \left[ - \frac{D s^2}{4k} ( 1 - e^{-2kt} ) + i s x_0 e^{-kt} \right] (3.8.71)

which, from Sect. 2.8.1, corresponds to a Gaussian with


\langle X(t) \rangle = x_0 \exp(-kt) (3.8.72)
\mathrm{var} \{ X(t) \} = \frac{D}{2k} [ 1 - \exp(-2kt) ] . (3.8.73)


Clearly, as t → ∞, the mean and variance approach limits 0 and D/2k, respectively,
which gives a limiting stationary solution. This solution can also be obtained
directly by requiring ∂_t p = 0, so that p satisfies the stationary Fokker-Planck equation


\partial_x [ kxp + \frac{1}{2} D \, \partial_x p ] = 0 (3.8.74)


and integrating once, we find


[ kxp + \frac{1}{2} D \, \partial_x p ]_{-\infty}^{x} = 0 . (3.8.75)


The requirement that p vanish at -∞ together with its derivative, is necessary for
normalisation. Hence, we have


\frac{1}{p} \partial_x p = - \frac{2kx}{D} , (3.8.76)


so that p_s(x) = (\pi D / k)^{-1/2} \exp(-k x^2 / D) , (3.8.77)


which is a Gaussian with mean 0 and variance D/2k, as predicted from the time- 
dependent solution. 

It is clear that a stationary solution can always be obtained for a one variable 
system by this integration method if such a stationary solution exists. If a stationary 
solution does not exist, this method gives an unnormalisable solution. 


Time Correlation Functions for the Ornstein-Uhlenbeck Process. The time correla- 
tion function analogous to that mentioned in connection with the Wiener process 
can be calculated and is a measurable piece of data in most stochastic systems. 
However, we have no easy way of computing it other than by definition 


\langle X(t) X(s) | [x_0, t_0] \rangle = \iint dx_1 \, dx_2 \, x_1 x_2 \, p(x_1, t; x_2, s | x_0, t_0) (3.8.78)

and using the Markov property

    = \iint dx_1 \, dx_2 \, x_1 x_2 \, p(x_1, t | x_2, s) \, p(x_2, s | x_0, t_0) (3.8.79)

on the assumption that


t \ge s \ge t_0 . (3.8.80)

The correlation function with a definite initial condition is not normally of as
much interest as the stationary correlation function, which is obtained by allowing the
system to approach the stationary distribution. It is achieved by putting the initial
condition in the remote past, as pointed out in Sect. 3.7.2. Letting t_0 → -∞, we
find


\lim_{t_0 \to -\infty} p(x_2, s | x_0, t_0) = p_s(x_2) = (\pi D / k)^{-1/2} \exp(-k x_2^2 / D) , (3.8.81)


and by straightforward substitution and integration and noting that the stationary
mean is zero, we get


\langle X(t) X(s) \rangle_s = \langle X(t), X(s) \rangle_s = \frac{D}{2k} \exp(-k |t - s|) . (3.8.82)


This result demonstrates the general property of stationary processes: that the 
correlation functions depend only on time differences. It is also a general result 
[3.6] that the process we have described in this section is the only stationary Gaussian
Markov process in one real variable.

The results of this subsection are very easily obtained by the stochastic differ- 
ential equation methods which will be developed in Chap.4. 
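The stationary variance D/2k and the exponential correlation function can be checked by simulation. The sketch below assumes that, because the conditional distribution is Gaussian with mean (3.8.72) and variance (3.8.73), one step of length dt may be sampled exactly as x → x exp(-k dt) + Gaussian noise with variance (D/2k)[1 - exp(-2k dt)]. The function name, parameter values, seed, and path length are illustrative.

```python
import math
import random

def ou_path(k, D, dt, n_steps, seed=0):
    """Stationary Ornstein-Uhlenbeck samples on a time grid of spacing dt."""
    rng = random.Random(seed)
    decay = math.exp(-k * dt)
    # Conditional standard deviation over one step, taken from (3.8.73).
    sd = math.sqrt(D / (2.0 * k) * (1.0 - decay * decay))
    x = rng.gauss(0.0, math.sqrt(D / (2.0 * k)))  # start in the stationary state
    path = []
    for _ in range(n_steps):
        path.append(x)
        x = x * decay + sd * rng.gauss(0.0, 1.0)
    return path

k, D, dt = 2.0, 4.0, 0.05
path = ou_path(k, D, dt, 200000)
lag = 10  # corresponds to a time difference lag * dt = 0.5
var_est = sum(x * x for x in path) / len(path)
corr_est = sum(path[i] * path[i + lag] for i in range(len(path) - lag)) / (len(path) - lag)
print(var_est, corr_est)  # compare with D/2k = 1 and (D/2k) exp(-k * 0.5)
```

Here the time average over one long path stands in for the ensemble average, which is justified by the ergodic properties of this process.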

The Ornstein-Uhlenbeck process is a simple, explicitly representable process, 
which has a stationary solution. In its stationary state, it is often used to model a 
realistic noise signal, in which X(t) and X(s) are only significantly correlated if 


|t - s| \sim 1/k \equiv \tau . (3.8.83)


(More precisely, τ, known as the correlation time, can be defined for arbitrary
processes X(s) by


\tau = \int_0^{\infty} dt \, \langle X(t), X(0) \rangle_s / \mathrm{var} \{ X \}_s , (3.8.84)


which is independent of the precise functional form of the correlation function).


3.8.5 Random Telegraph Process 


We consider a signal X(t) which can have either of two values a and b and switches 
from one to the other with certain probabilities per unit time. Thus, we have a 
Master equation 


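$$\begin{aligned}
\partial_t P(a, t \mid x, t_0) &= -\lambda P(a, t \mid x, t_0) + \mu P(b, t \mid x, t_0) \\
\partial_t P(b, t \mid x, t_0) &= \lambda P(a, t \mid x, t_0) - \mu P(b, t \mid x, t_0)
\end{aligned} \tag{3.8.85}$$

for which the solution can simply be found by noting that

$$P(a, t \mid x, t_0) + P(b, t \mid x, t_0) = 1$$

and that a simple equation can be derived for $\lambda P(a, t \mid x, t_0) - \mu P(b, t \mid x, t_0)$, whose
solution is, because of the initial condition

$$P(x', t_0 \mid x, t_0) = \delta_{x, x'} , \tag{3.8.86}$$

$$\lambda P(a, t \mid x, t_0) - \mu P(b, t \mid x, t_0) = \exp[-(\lambda + \mu)(t - t_0)]\, (\lambda \delta_{a,x} - \mu \delta_{b,x}) \tag{3.8.87}$$

so that

$$\begin{aligned}
P(a, t \mid x, t_0) &= \frac{\mu}{\lambda + \mu} + \exp[-(\lambda + \mu)(t - t_0)] \left( \frac{\lambda}{\lambda + \mu} \delta_{a,x} - \frac{\mu}{\lambda + \mu} \delta_{b,x} \right) \\
P(b, t \mid x, t_0) &= \frac{\lambda}{\lambda + \mu} - \exp[-(\lambda + \mu)(t - t_0)] \left( \frac{\lambda}{\lambda + \mu} \delta_{a,x} - \frac{\mu}{\lambda + \mu} \delta_{b,x} \right) .
\end{aligned} \tag{3.8.88}$$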


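This process clearly has the stationary solution obtained by letting $t_0 \to -\infty$:

$$P_s(a) = \frac{\mu}{\lambda + \mu} , \qquad P_s(b) = \frac{\lambda}{\lambda + \mu} \tag{3.8.89}$$

which is, of course, obvious from the Master equation.
The mean of X(t) and its variance are straightforwardly computed:

$$\langle X(t) \mid [x_0, t_0] \rangle = \sum_x x\, P(x, t \mid x_0, t_0)
= \frac{a\mu + b\lambda}{\mu + \lambda} + \exp[-(\lambda + \mu)(t - t_0)] \left( x_0 - \frac{a\mu + b\lambda}{\mu + \lambda} \right) \tag{3.8.90}$$

so that

$$\langle X \rangle_s = \frac{a\mu + b\lambda}{\mu + \lambda} . \tag{3.8.91}$$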


The variance can also be computed but is a very messy expression. The stationary 
variance is easily computed to be 


$$\mathrm{var}\{X\}_s = \frac{(a - b)^2 \mu\lambda}{(\lambda + \mu)^2} . \tag{3.8.92}$$


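To compute the stationary time correlation function, we write (assuming $t \geq s$)

$$\langle X(t)X(s) \rangle_s = \sum_{x, x'} x x'\, P(x, t \mid x', s)\, P_s(x') \tag{3.8.93}$$

$$= \sum_{x'} x'\, \langle X(t) \mid [x', s] \rangle\, P_s(x') . \tag{3.8.94}$$

We now use (3.8.90-92) to obtain

$$\langle X(t)X(s) \rangle_s = \langle X \rangle_s^2 + \exp[-(\lambda + \mu)(t - s)]\, (\langle X^2 \rangle_s - \langle X \rangle_s^2) \tag{3.8.95}$$

$$= \left( \frac{a\mu + b\lambda}{\mu + \lambda} \right)^2 + \exp[-(\lambda + \mu)(t - s)]\, \frac{(a - b)^2 \mu\lambda}{(\lambda + \mu)^2} . \tag{3.8.96}$$

Hence,

$$\langle X(t), X(s) \rangle_s = \langle X(t)X(s) \rangle_s - \langle X \rangle_s^2 = \exp[-(\lambda + \mu)|t - s|]\, \frac{(a - b)^2 \mu\lambda}{(\lambda + \mu)^2} . \tag{3.8.97}$$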


Notice that this time correlation function is of exactly the same form as that of the 
Ornstein-Uhlenbeck process. Higher-order correlation functions are not the same 
of course, but because of this simple correlation function and the simplicity of the 
two state process, the random telegraph signal also finds wide application in 
model building. 
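
The closed forms above are easy to check against the Master equation itself. The sketch below is illustrative only: the function names are hypothetical, the two states are labelled by the strings `"a"` and `"b"`, and the initial time is taken as $t_0 = 0$. It evaluates (3.8.88), integrates (3.8.85) by a simple forward-Euler step, and compares (3.8.97) with a direct evaluation of the double sum (3.8.93).

```python
import math

def telegraph_prob(x, x0, t, lam, mu):
    """Conditional probability P(x, t | x0, 0) from (3.8.88); states are 'a' and 'b'."""
    total = lam + mu
    # Initial value of lam*P(a) - mu*P(b), divided by (lam + mu); cf. (3.8.87).
    q0 = (lam if x0 == "a" else -mu) / total
    decay = math.exp(-total * t)
    if x == "a":
        return mu / total + decay * q0
    return lam / total - decay * q0

def integrate_master_equation(start_in_a, t, lam, mu, n_steps=20000):
    """Forward-Euler integration of the Master equation (3.8.85)."""
    pa, pb = (1.0, 0.0) if start_in_a else (0.0, 1.0)
    dt = t / n_steps
    for _ in range(n_steps):
        flow = (-lam * pa + mu * pb) * dt   # net probability flowing into state a
        pa, pb = pa + flow, pb - flow
    return pa, pb

def stationary_covariance(tau, a_val, b_val, lam, mu):
    """Stationary correlation function (3.8.97)."""
    var_s = (a_val - b_val) ** 2 * lam * mu / (lam + mu) ** 2
    return math.exp(-(lam + mu) * abs(tau)) * var_s

def covariance_from_definition(tau, a_val, b_val, lam, mu):
    """Evaluate the double sum (3.8.93) and subtract the squared stationary mean."""
    values = {"a": a_val, "b": b_val}
    p_s = {"a": mu / (lam + mu), "b": lam / (lam + mu)}   # (3.8.89)
    second = sum(values[x] * values[x0] * telegraph_prob(x, x0, tau, lam, mu) * p_s[x0]
                 for x in values for x0 in values)
    mean = sum(values[x] * p_s[x] for x in values)
    return second - mean ** 2
```

For any positive `lam` and `mu`, the two covariance functions should agree to rounding error, and the Euler result should approach `telegraph_prob` as `n_steps` grows.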


4. The Ito Calculus and Stochastic Differential Equations 


4.1 Motivation 


In Sect.1.2.2 we met for the first time the equation which is the prototype of what
is now known as a Langevin equation, which can be described heuristically as an 
ordinary differential equation in which a rapidly and irregularly fluctuating random 
function of time [the term X(t) in Langevin’s original equation] occurs. The sim- 
plicity of Langevin’s derivation of Einstein’s results is in itself sufficient motivation 
to attempt to put the concept of such an equation on a reasonably precise footing. 

The simple-minded Langevin equation that turns up most often can be written 
in the form 


$$\frac{dx}{dt} = a(x, t) + b(x, t)\,\xi(t) , \tag{4.1.1}$$

where x is the variable of interest, a(x, t) and b(x, t) are certain known functions and
$\xi(t)$ is the rapidly fluctuating random term. An idealised mathematical formulation
of the concept of a "rapidly varying, highly irregular function" is that for $t \neq t'$, $\xi(t)$
and $\xi(t')$ are statistically independent. We also require $\langle \xi(t) \rangle = 0$, since any
nonzero mean can be absorbed into the definition of a(x, t), and thus require that

$$\langle \xi(t)\,\xi(t') \rangle = \delta(t - t') \tag{4.1.2}$$

which satisfies the requirement of no correlation at different times and furthermore,
has the rather pathological result that $\xi(t)$ has infinite variance. From a realistic
point of view, we know that no quantity can have such an infinite variance, but 
the concept of white noise as an idealisation of a realistic fluctuating signal does 
have some meaning, and has already been mentioned in Sect.1.4.2 in connection 
with Johnson noise in electrical circuits. We have already met two sources which 
might be considered realistic versions of almost uncorrelated noise, namely, the 
Ornstein-Uhlenbeck process and the random telegraph signal. For both of these 
the second-order correlation function can, up to a constant factor, be put in the 
form 


$$\langle X(t), X(t') \rangle = \frac{\gamma}{2}\, e^{-\gamma |t - t'|} . \tag{4.1.3}$$

Now the essential difference between these two is that the sample paths of the random
telegraph signal are discontinuous, while those of the Ornstein-Uhlenbeck process
are not. If (4.1.1) is to be regarded as a real differential equation, in which $\xi(t)$
is not white noise with a delta function correlation, but rather a noise with a finite
correlation time, then the choice of a continuous function for $\xi(t)$ seems essential
to make this equation realistic: we do not expect dx/dt to change discontinuously.
The limit as $\gamma \to \infty$ of the correlation function (4.1.3) is clearly the Dirac delta
function since

$$\int_{-\infty}^{\infty} \frac{\gamma}{2}\, e^{-\gamma |t - t'|}\, dt' = 1 \tag{4.1.4}$$

and for $t \neq t'$,

$$\lim_{\gamma \to \infty} \frac{\gamma}{2}\, e^{-\gamma |t - t'|} = 0 . \tag{4.1.5}$$

This means that a possible model of the $\xi(t)$ could be obtained by taking some
kind of limit as $\gamma \to \infty$ of the Ornstein-Uhlenbeck process. This would correspond,
in the notation of Sect. 3.8.4, to the limit

$$k \to \infty \tag{4.1.6}$$

with $D = k^2$. This limit simply does not exist. Any such limit must clearly be taken
after calculating measurable quantities. Such a procedure is possible but too
cumbersome to use as a calculational tool.

An alternative approach is called for. Since we write the differential equation
(4.1.1), we must expect it to be integrable and hence must expect that

$$u(t) = \int_0^t dt'\, \xi(t') \tag{4.1.7}$$

exists.
Suppose we now demand the ordinary property of an integral, that u(t) is a continuous
function of t. This implies that u(t) is a Markov process since we can write

$$u(t') = \int_0^{t} ds\, \xi(s) + \int_t^{t'} ds\, \xi(s) \tag{4.1.8}$$

$$= \lim_{\varepsilon \to 0} \left[ \int_0^{t - \varepsilon} ds\, \xi(s) \right] + \int_t^{t'} ds\, \xi(s) \tag{4.1.9}$$

and for any $\varepsilon > 0$, the $\xi(s)$ in the first integral are independent of the $\xi(s)$ in the
second integral. Hence, by continuity, u(t) and u(t') − u(t) are statistically independent
and further, u(t') − u(t) is independent of u(t'') for all t'' ≤ t. This means
that u(t') is fully determined (probabilistically) from the knowledge of the value of
u(t) and not by any past values. Hence, u(t) is a Markov process.

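Since the sample functions of u(t) are continuous, we must be able to describe
u(t) by a Fokker-Planck equation. We can compute the drift and diffusion coefficients
for this process by using the formulae of Sect.3.5.2. We can write

$$\langle u(t + \Delta t) - u_0 \mid [u_0, t] \rangle = \Big\langle \int_t^{t + \Delta t} \xi(s)\, ds \Big\rangle = 0 \tag{4.1.10}$$

and

$$\langle [u(t + \Delta t) - u_0]^2 \mid [u_0, t] \rangle = \int_t^{t + \Delta t} ds \int_t^{t + \Delta t} ds'\, \langle \xi(s)\,\xi(s') \rangle \tag{4.1.11}$$

$$= \int_t^{t + \Delta t} ds \int_t^{t + \Delta t} ds'\, \delta(s - s') = \Delta t \tag{4.1.12}$$

so that the drift and diffusion coefficients are

$$A(u_0, t) = \lim_{\Delta t \to 0} \frac{\langle u(t + \Delta t) - u_0 \mid [u_0, t] \rangle}{\Delta t} = 0 \tag{4.1.13}$$

$$B(u_0, t) = \lim_{\Delta t \to 0} \frac{\langle [u(t + \Delta t) - u_0]^2 \mid [u_0, t] \rangle}{\Delta t} = 1 . \tag{4.1.14}$$

Thus, the Fokker-Planck equation is that of the Wiener process and we can write

$$\int_0^t \xi(t')\, dt' = u(t) = W(t) . \tag{4.1.15}$$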


Thus, we have the paradox that the integral of $\xi(t)$ is W(t), which is itself not
differentiable, as shown in Sect.3.8.1. This means that mathematically speaking, the
Langevin equation (4.1.1) does not exist. However, the corresponding integral
equation

$$x(t) - x(0) = \int_0^t a[x(s), s]\, ds + \int_0^t b[x(s), s]\, \xi(s)\, ds \tag{4.1.16}$$

can be interpreted consistently.
We make the replacement, which follows directly from the interpretation of the
integral of $\xi(t)$ as the Wiener process W(t), that

$$dW(t) \equiv W(t + dt) - W(t) = \xi(t)\, dt \tag{4.1.17}$$

and thus write the second integral as

$$\int_0^t b[x(s), s]\, dW(s) , \tag{4.1.18}$$


which is a kind of stochastic Stieltjes integral with respect to a sample function 
W(t). Such an integral can be defined and we will carry this out in the next section. 

Before doing so, it should be noted that the requirement that u(t) be continuous, 
while very natural, can be relaxed to yield a way of defining jump processes as 
stochastic differential equations. This has already been hinted at in the treatment of 
shot noise in Sect. 1.4.1. However, it does not seem to be nearly so useful and will 
not be treated in this book. The interested reader is referred to [4.1].

As a final point, we should note that one normally assumes that $\xi(t)$ is Gaussian,
and satisfies the conditions (4.1.2) as well. The above did not require this:
the Gaussian nature follows in fact from the assumed continuity of u(t). Which of
these assumptions is made is, in a strict sense, a matter of taste. However, the
continuity of u(t) seems a much more natural assumption to make than the Gaussian
nature of $\xi(t)$, which involves in principle the determination of moments of
arbitrarily high order.


4.2 Stochastic Integration 


4.2.1 Definition of the Stochastic Integral 


Suppose G(t) is an arbitrary function of time and W(t) is the Wiener process. We 
define the stochastic integral $\int_{t_0}^{t} G(t')\, dW(t')$ as a kind of Riemann-Stieltjes integral.
Namely, we divide the interval $[t_0, t]$ into n subintervals by means of partitioning
points (as in Fig. 4.1)

$$t_0 \leq t_1 \leq t_2 \leq \dots \leq t_{n-1} \leq t , \tag{4.2.1}$$

and define intermediate points $\tau_i$ such that

$$t_{i-1} \leq \tau_i \leq t_i . \tag{4.2.2}$$

The stochastic integral $\int_{t_0}^{t} G(t')\, dW(t')$ is defined as a limit of the partial sums

$$S_n = \sum_{i=1}^{n} G(\tau_i)\,[W(t_i) - W(t_{i-1})] . \tag{4.2.3}$$

It is heuristically quite easy to see that, in general, the integral defined as the limit of
$S_n$ depends on the particular choice of intermediate point $\tau_i$. For, if we take the
choice of $G(\tau_i) = W(\tau_i)$,

$$\langle S_n \rangle = \Big\langle \sum_{i=1}^{n} W(\tau_i)\,[W(t_i) - W(t_{i-1})] \Big\rangle \tag{4.2.4}$$

$$= \sum_{i=1}^{n} [\min(\tau_i, t_i) - \min(\tau_i, t_{i-1})] \tag{4.2.5}$$

$$= \sum_{i=1}^{n} (\tau_i - t_{i-1}) . \tag{4.2.6}$$

If, for example, we choose for all i

$$\tau_i = \alpha t_i + (1 - \alpha)\, t_{i-1} \qquad (0 < \alpha < 1) , \tag{4.2.7}$$

then

$$\langle S_n \rangle = \sum_{i=1}^{n} (t_i - t_{i-1})\, \alpha = (t - t_0)\, \alpha . \tag{4.2.8}$$

So that the mean value of the integral can be anything between zero and $(t - t_0)$,
depending on the choice of intermediate points.

We therefore make the choice of intermediate points characterised by $\alpha = 0$,
that is, we choose

$$\tau_i = t_{i-1} \tag{4.2.9}$$

and thus define the Ito stochastic integral of the function G(t) by

$$\int_{t_0}^{t} G(t')\, dW(t') = \operatorname*{ms\text{-}lim}_{n \to \infty} \Big\{ \sum_{i=1}^{n} G(t_{i-1})\,[W(t_i) - W(t_{i-1})] \Big\} . \tag{4.2.10}$$

By ms-lim we mean the mean square limit, as defined in Sect.2.9.2.


4.2.2 Example $\int_{t_0}^{t} W(t')\, dW(t')$


An exact calculation is possible. We write [writing $W_i$ for $W(t_i)$]

$$S_n = \sum_{i=1}^{n} W_{i-1}\,(W_i - W_{i-1}) \equiv \sum_{i=1}^{n} W_{i-1}\, \Delta W_i \tag{4.2.11}$$

$$= \tfrac{1}{2} \sum_{i=1}^{n} [(W_{i-1} + \Delta W_i)^2 - (W_{i-1})^2 - (\Delta W_i)^2] \tag{4.2.12}$$

$$= \tfrac{1}{2} [W(t)^2 - W(t_0)^2] - \tfrac{1}{2} \sum_{i=1}^{n} (\Delta W_i)^2 . \tag{4.2.13}$$

We can calculate the mean square limit of the last term. Notice that

$$\Big\langle \sum_i \Delta W_i^2 \Big\rangle = \sum_i \langle (W_i - W_{i-1})^2 \rangle = \sum_i (t_i - t_{i-1}) = t - t_0 . \tag{4.2.14}$$

Because of this,

$$\Big\langle \Big[ \sum_i (W_i - W_{i-1})^2 - (t - t_0) \Big]^2 \Big\rangle
= \Big\langle \sum_i (W_i - W_{i-1})^4 + 2 \sum_{i>j} (W_i - W_{i-1})^2 (W_j - W_{j-1})^2
- 2 (t - t_0) \sum_i (W_i - W_{i-1})^2 + (t - t_0)^2 \Big\rangle . \tag{4.2.15}$$

Notice that $W_i - W_{i-1}$ is a Gaussian variable and is independent of $W_j - W_{j-1}$ for $j \neq i$.
Hence, we can factorise. Thus,

$$\langle (W_i - W_{i-1})^2 (W_j - W_{j-1})^2 \rangle = (t_i - t_{i-1})(t_j - t_{j-1}) \tag{4.2.16}$$

and also [using (2.8.6)]

$$\langle (W_i - W_{i-1})^4 \rangle = 3 \langle (W_i - W_{i-1})^2 \rangle^2 = 3 (t_i - t_{i-1})^2 \tag{4.2.17}$$

which combined with (4.2.16) gives

$$\begin{aligned}
\Big\langle \Big[ \sum_i (W_i - W_{i-1})^2 - (t - t_0) \Big]^2 \Big\rangle
&= 2 \sum_i (t_i - t_{i-1})^2 + \Big[ \sum_i (t_i - t_{i-1}) - (t - t_0) \Big] \Big[ \sum_j (t_j - t_{j-1}) - (t - t_0) \Big] \\
&= 2 \sum_i (t_i - t_{i-1})^2 \\
&\to 0 \quad \text{as } n \to \infty .
\end{aligned} \tag{4.2.18}$$

Thus,

$$\operatorname*{ms\text{-}lim}_{n \to \infty} \sum_i (W_i - W_{i-1})^2 = t - t_0 \tag{4.2.19}$$

by definition of the mean square limit, so

$$\int_{t_0}^{t} W(t')\, dW(t') = \tfrac{1}{2} [W(t)^2 - W(t_0)^2 - (t - t_0)] . \tag{4.2.20}$$


Comments

i) $$\Big\langle \int_{t_0}^{t} W(t')\, dW(t') \Big\rangle = \tfrac{1}{2} [\langle W(t)^2 \rangle - \langle W(t_0)^2 \rangle - (t - t_0)] = 0 . \tag{4.2.21}$$

This is also obvious by definition, since in the individual terms we have
$\langle W_{i-1} \Delta W_i \rangle$, which vanishes because $\Delta W_i$ is statistically independent of $W_{i-1}$,
as was demonstrated in Sect.3.8.1.

ii) The result for the integral is no longer the same as for the ordinary Riemann-Stieltjes
integral in which the term $(t - t_0)$ would be absent. The reason for this is that
$|W(t + \Delta t) - W(t)|$ is almost always of the order $\sqrt{\Delta t}$, so that in contrast to
ordinary integration, terms of second order in $\Delta W(t)$ do not vanish on taking the
limit.


4.2.3 The Stratonovich Integral


An alternative definition was introduced by Stratonovich [4.2] as a stochastic integral
in which the anomalous term $(t - t_0)$ does not occur. We define this fully in
Sect. 4.3.6; in the cases considered so far, it amounts to evaluating the integrand
as a function of W(t) at the value $\tfrac{1}{2}[W(t_i) + W(t_{i-1})]$. It is straightforward to show
that

$$S\!\int_{t_0}^{t} W(t')\, dW(t') = \operatorname*{ms\text{-}lim}_{n \to \infty} \sum_{i=1}^{n} \frac{W(t_i) + W(t_{i-1})}{2}\, [W(t_i) - W(t_{i-1})] \tag{4.2.22}$$

$$= \tfrac{1}{2} [W(t)^2 - W(t_0)^2] . \tag{4.2.23}$$

However, the integral as defined by Stratonovich [which we will always designate
by a prefixed S as in (4.2.22)] has no general relationship with that defined by Ito.
That is, for arbitrary functions G(t), there is no connection between the two 
integrals. [In the case, however, where we can specify that G(t) is related to some 
stochastic differential equation, a formula can be given relating one to the other, 
see Sect.4.3.6]. 
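
The dependence on the intermediate point, and the difference between the Ito and Stratonovich sums for $\int W\, dW$, can be seen directly in a simulation. The following is a minimal sketch under stated assumptions: W(0) = 0, a uniform mesh, and $\alpha$ restricted to a rational value `alpha_num / alpha_den` so that $W(\tau_i)$ can be read off a finer grid. All function names are hypothetical.

```python
import math
import random

def wiener_path(n_steps, dt, rng):
    """Return [W(0)=0, W(dt), ..., W(n_steps*dt)] built from independent Gaussian increments."""
    w = [0.0]
    sd = math.sqrt(dt)
    for _ in range(n_steps):
        w.append(w[-1] + rng.gauss(0.0, sd))
    return w

def ito_sum(w):
    """Partial sum with the integrand at the left end point, as in (4.2.10)."""
    return sum(w[i - 1] * (w[i] - w[i - 1]) for i in range(1, len(w)))

def stratonovich_sum(w):
    """Partial sum with the integrand averaged over both end points, as in (4.2.22)."""
    return sum(0.5 * (w[i] + w[i - 1]) * (w[i] - w[i - 1]) for i in range(1, len(w)))

def mean_alpha_sum(alpha_num, alpha_den, n_steps, t_final, n_paths, seed=0):
    """Monte Carlo mean of S_n with G = W and tau_i = t_{i-1} + alpha*(t_i - t_{i-1}).

    alpha = alpha_num / alpha_den; each interval is split into alpha_den fine steps.
    """
    rng = random.Random(seed)
    fine_dt = t_final / (n_steps * alpha_den)
    total = 0.0
    for _ in range(n_paths):
        w = wiener_path(n_steps * alpha_den, fine_dt, rng)
        for i in range(1, n_steps + 1):
            left, right = (i - 1) * alpha_den, i * alpha_den
            total += w[left + alpha_num] * (w[right] - w[left])
    return total / n_paths
```

On a single fine path with $t_0 = 0$, `stratonovich_sum` telescopes to $\tfrac{1}{2} W(t)^2$ up to rounding, while `ito_sum` differs from it by $\tfrac{1}{2} \sum_i \Delta W_i^2$, which is close to $t/2$. The averages returned by `mean_alpha_sum` for $\alpha = 0, \tfrac{1}{2}, 1$ should be near $0$, $t/2$ and $t$ respectively, in line with (4.2.8).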


4.2.4 Nonanticipating Functions 


The concept of a nonanticipating function can be easily made quite obscure by
complex notation, but is really quite simple. We have in mind a situation in which
all functions can be expressed as functions or functionals of a certain Wiener
process W(t) through the mechanism of a stochastic differential (or integral)
equation of the form

$$x(t) - x(t_0) = \int_{t_0}^{t} a[x(t'), t']\, dt' + \int_{t_0}^{t} b[x(t'), t']\, dW(t') . \tag{4.2.24}$$

A function G(t) is called a nonanticipating function of t if for all s and t such that
t < s, G(t) is statistically independent of W(s) − W(t). This means that G(t) is
independent of the behaviour of the Wiener process in the future of t. This is clearly
a rather reasonable requirement for a physical function which could be a solution
of an equation like (4.2.24) in which it seems heuristically obvious that x(t) involves
W(t') only for t' ≤ t.

For example, specific nonanticipating functions of t are:

i) $W(t)$

ii) $\int_{t_0}^{t} dt'\, F[W(t')]$

iii) $\int_{t_0}^{t} dW(t')\, F[W(t')]$

iv) $\int_{t_0}^{t} dt'\, G(t')$, when G(t) is itself a nonanticipating function

v) $\int_{t_0}^{t} dW(t')\, G(t')$, when G(t) is itself a nonanticipating function

Results (iii) and (v) depend on the fact that the Ito stochastic integral, as defined
in (4.2.10), is a limit of the sequence in which only G(t') for t' < t and W(t') for
t' ≤ t are involved.

The reasons for considering nonanticipating functions specifically are: 

i) many results can be derived, which are only true for such functions; 

ii) they occur naturally in situations, such as in the study of differential equations 
involving time, in which some kind of causality is expected in the sense that the 
unknown future cannot affect the present; 

iii) the definition of stochastic differential equations requires such functions. 


4.2.5 Proof that $dW(t)^2 = dt$ and $dW(t)^{2+N} = 0$


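The formulae in the heading are the key to the use of the Ito calculus as an ordinary
computational tool. However, as written they are not very precise and what is really
meant is

$$\int_{t_0}^{t} [dW(t')]^{2+N}\, G(t') \equiv \operatorname*{ms\text{-}lim}_{n \to \infty} \sum_i G_{i-1}\, \Delta W_i^{2+N}
= \begin{cases} \displaystyle\int_{t_0}^{t} dt'\, G(t') & \text{for } N = 0 \\[2mm] 0 & \text{for } N > 0 \end{cases} \tag{4.2.25}$$

for an arbitrary nonanticipating function G(t).
The proof is quite straightforward. For N = 0, let us define

$$I = \lim_{n \to \infty} \Big\langle \Big[ \sum_i G_{i-1}\, (\Delta W_i^2 - \Delta t_i) \Big]^2 \Big\rangle \tag{4.2.26}$$

$$= \lim_{n \to \infty} \Big\langle \sum_i \underbrace{(G_{i-1})^2}\, \underbrace{(\Delta W_i^2 - \Delta t_i)^2}
+ \sum_{i>j} \underbrace{2\, G_{i-1} G_{j-1}\, (\Delta W_j^2 - \Delta t_j)}\, \underbrace{(\Delta W_i^2 - \Delta t_i)} \Big\rangle . \tag{4.2.27}$$

The horizontal braces indicate factors which are statistically independent of each
other because of the properties of the Wiener process, and because the $G_i$ are values
of a nonanticipating function which are independent of all $\Delta W_j$ for j > i.

Using this independence, we can factorise the means, and also using

i) $\langle \Delta W_i^2 \rangle = \Delta t_i$,
ii) $\langle (\Delta W_i^2 - \Delta t_i)^2 \rangle = 2 \Delta t_i^2$ (from the Gaussian nature of $\Delta W_i$),

we find

$$I = 2 \lim_{n \to \infty} \Big[ \sum_i \Delta t_i^2\, \langle (G_{i-1})^2 \rangle \Big] . \tag{4.2.28}$$

Under reasonably mild conditions on G(t) (e.g., boundedness), this means that

$$\operatorname*{ms\text{-}lim}_{n \to \infty} \Big( \sum_i G_{i-1}\, \Delta W_i^2 - \sum_i G_{i-1}\, \Delta t_i \Big) = 0 \tag{4.2.29}$$

and since

$$\operatorname*{ms\text{-}lim}_{n \to \infty} \sum_i G_{i-1}\, \Delta t_i = \int_{t_0}^{t} dt'\, G(t') , \tag{4.2.30}$$

we have

$$\int_{t_0}^{t} [dW(t')]^2\, G(t') = \int_{t_0}^{t} dt'\, G(t') .$$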


Comments 
i) The proof that $\int_{t_0}^{t} G(t')\, [dW(t')]^{2+N} = 0$ for N > 0 is similar and uses the explicit
expressions for the higher moments of a Gaussian (Sect.2.8.1).

ii) dW(t) only occurs in integrals so that when we restrict ourselves to nonanticipating
functions, we can simply write

$$dW(t)^2 \equiv dt \tag{4.2.31}$$

$$dW(t)^{2+N} \equiv 0 \quad (N > 0) . \tag{4.2.32}$$

iii) The results are only valid for the Ito integral, since we have used the fact that
$\Delta W_i$ is independent of $G_{i-1}$. In the Stratonovich integral,

$$\Delta W_i = W(t_i) - W(t_{i-1}) \tag{4.2.33}$$

$$G_{i-1} = G[\tfrac{1}{2}(t_i + t_{i-1})] \tag{4.2.34}$$

and although G(t) is nonanticipating, this is not sufficient to guarantee the independence
of $\Delta W_i$ and $G_{i-1}$ as thus defined.

iv) By similar methods one can prove that

$$\int_{t_0}^{t} G(t')\, dt'\, dW(t') \equiv \operatorname*{ms\text{-}lim}_{n \to \infty} \sum_i G_{i-1}\, \Delta W_i\, \Delta t_i = 0 \tag{4.2.35}$$

and similarly for higher powers. The simplest way of characterising these results
is to say that dW(t) is an infinitesimal of order 1/2 and that in calculating differentials,
infinitesimals of order higher than 1 are discarded.
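
The content of (4.2.25) and (4.2.35) can be made concrete by evaluating the discrete sums on one finely discretised sample path. In the sketch below, the choice $G(t) = \cos W(t)$ is an assumption made only because it is bounded and nonanticipating; the function name `ito_rule_sums` is hypothetical.

```python
import math
import random

def ito_rule_sums(n_steps, t_final, seed=0):
    """Discrete sums from (4.2.25) and (4.2.35) for the assumed choice G(t) = cos W(t).

    Returns (sum G dW^2, sum G dt, sum G dW^3, sum G dW dt) on one sample path.
    """
    rng = random.Random(seed)
    dt = t_final / n_steps
    sd = math.sqrt(dt)
    w = 0.0
    sum_g_dw2 = sum_g_dt = sum_g_dw3 = sum_g_dw_dt = 0.0
    for _ in range(n_steps):
        g = math.cos(w)            # G_{i-1}: depends on W only up to t_{i-1}
        dw = rng.gauss(0.0, sd)    # Delta W_i, drawn independently of G_{i-1}
        sum_g_dw2 += g * dw ** 2
        sum_g_dt += g * dt
        sum_g_dw3 += g * dw ** 3
        sum_g_dw_dt += g * dw * dt
        w += dw
    return sum_g_dw2, sum_g_dt, sum_g_dw3, sum_g_dw_dt
```

As the mesh is refined, the first two sums should approach each other (the N = 0 case), while the last two should shrink towards zero. By (4.2.28) with $|G| \leq 1$, the mean square difference between the first two sums is at most $2 \sum_i \Delta t_i^2$.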


4.2.6 Properties of the Ito Stochastic Integral 


a) Existence 
One can show that the Ito stochastic integral $\int_{t_0}^{t} G(t')\, dW(t')$ exists whenever the
function G(t') is continuous and nonanticipating on the closed interval $[t_0, t]$ [4.3].

b) Integration of Polynomials
We can formally use the result of Sect.4.2.5:

$$d[W(t)]^n = [W(t) + dW(t)]^n - W(t)^n = \sum_{r=1}^{n} \binom{n}{r} W(t)^{n-r}\, dW(t)^r$$

and using the fact that $dW(t)^r \to 0$ for all r > 2,

$$= n W(t)^{n-1}\, dW(t) + \frac{n(n-1)}{2}\, W(t)^{n-2}\, dt \tag{4.2.36}$$

so that

$$\int_{t_0}^{t} W(t')^n\, dW(t') = \frac{1}{n+1} [W(t)^{n+1} - W(t_0)^{n+1}] - \frac{n}{2} \int_{t_0}^{t} W(t')^{n-1}\, dt' . \tag{4.2.37}$$

c) Two Kinds of Integral 
We note that for each G(t) there are two kinds of integrals, namely, 


$$\int_{t_0}^{t} G(t')\, dt' \quad \text{and} \quad \int_{t_0}^{t} G(t')\, dW(t') ,$$


both of which occur in the previous equation. There is, in general, no connection 
between these two kinds of integral. 


d) General Differentiation Rules 
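In forming differentials [as in (b) above], one must keep all terms up to second order
in dW(t). This means that, for example,

$$\begin{aligned}
d\{\exp[W(t)]\} &= \exp[W(t) + dW(t)] - \exp[W(t)] \\
&= \exp[W(t)]\, [dW(t) + \tfrac{1}{2} dW(t)^2] \\
&= \exp[W(t)]\, [dW(t) + \tfrac{1}{2} dt]
\end{aligned} \tag{4.2.38}$$

or more generally,

$$df[W(t), t] = \frac{\partial f}{\partial t}\, dt + \frac{1}{2} \frac{\partial^2 f}{\partial t^2}\, (dt)^2 + \frac{\partial f}{\partial W}\, dW(t)
+ \frac{1}{2} \frac{\partial^2 f}{\partial W^2}\, [dW(t)]^2 + \frac{\partial^2 f}{\partial W\, \partial t}\, dt\, dW(t) + \dots$$

and we use

$(dt)^2 \to 0$
$dt\, dW(t) \to 0$  [Sect. 4.2.5, comment (iv)]
$[dW(t)]^2 = dt$

and all higher powers vanish, to arrive at

$$df[W(t), t] = \left( \frac{\partial f}{\partial t} + \frac{1}{2} \frac{\partial^2 f}{\partial W^2} \right) dt + \frac{\partial f}{\partial W}\, dW(t) . \tag{4.2.39}$$

e) Mean Value Formula
For nonanticipating G(t),

$$\Big\langle \int_{t_0}^{t} G(t')\, dW(t') \Big\rangle = 0 . \tag{4.2.40}$$

For, since G(t) is nonanticipating, in the definition of the stochastic integral,

$$\Big\langle \sum_i G_{i-1}\, \Delta W_i \Big\rangle = \sum_i \langle G_{i-1} \rangle \langle \Delta W_i \rangle = 0 \tag{4.2.41}$$

and we know from Sect. 2.9.5 that operations of ms-lim and $\langle\ \rangle$ may be interchanged.
Hence, taking the limit of (4.2.41), we have the result.

This result is not true for Stratonovich's integral, since the value of $G_{i-1}$ is
chosen in the middle of the interval, and can be correlated with $\Delta W_i$.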
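
The differentiation rule (4.2.38) can be tested on a single sample path by summing the right-hand side over a fine mesh and comparing with $\exp[W(t)] - \exp[W(0)]$. The following is a minimal sketch, assuming W(0) = 0 and a uniform mesh; the function name `check_exp_rule` and its return layout are hypothetical.

```python
import math
import random

def check_exp_rule(n_steps, t_final, seed=0):
    """Compare two discretised differentials of exp[W(t)] with the exact change.

    Returns (ito_total, naive_total, exact, w_max), where ito_total sums the
    right-hand side of (4.2.38) and naive_total omits the dt/2 correction.
    """
    rng = random.Random(seed)
    dt = t_final / n_steps
    sd = math.sqrt(dt)
    w = 0.0
    w_max = 0.0
    ito_total = 0.0
    naive_total = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sd)
        e = math.exp(w)                   # integrand at the left end point t_{i-1}
        ito_total += e * (dw + 0.5 * dt)  # exp(W) [dW + dt/2], as in (4.2.38)
        naive_total += e * dw             # what the ordinary chain rule would give
        w += dw
        w_max = max(w_max, w)
    exact = math.exp(w) - 1.0             # exp[W(t)] - exp[W(0)]
    return ito_total, naive_total, exact, w_max
```

On a fine mesh, `ito_total` should reproduce `exact` to within a small sampling error, whereas `naive_total` falls short by approximately $\tfrac{1}{2} \int_0^t \exp[W(t')]\, dt'$, which does not vanish as the mesh is refined.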


f) Correlation Formula 
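If G(t) and H(t) are arbitrary continuous nonanticipating functions,

$$\Big\langle \int_{t_0}^{t} G(t')\, dW(t') \int_{t_0}^{t} H(t')\, dW(t') \Big\rangle = \int_{t_0}^{t} dt'\, \langle G(t') H(t') \rangle . \tag{4.2.42}$$

Proof. Notice that

$$\Big\langle \sum_i G_{i-1}\, \Delta W_i \sum_j H_{j-1}\, \Delta W_j \Big\rangle = \Big\langle \sum_i G_{i-1} H_{i-1}\, (\Delta W_i)^2 \Big\rangle
+ \Big\langle \sum_{i>j} (G_{i-1} H_{j-1} + G_{j-1} H_{i-1})\, \Delta W_j\, \Delta W_i \Big\rangle . \tag{4.2.43}$$

In the second term, $\Delta W_i$ is independent of all other terms since j < i, and G and H
are nonanticipating. Hence, we may factorise out the term $\langle \Delta W_i \rangle = 0$ so that this
term vanishes. Using

$$\langle \Delta W_i^2 \rangle = \Delta t_i \tag{4.2.44}$$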


and interchanging mean and limit operations, the result follows. 
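
A Monte Carlo check of the correlation formula (4.2.42) is straightforward for simple integrands. The sketch below assumes $t_0 = 0$, W(0) = 0, and the particular choices G(t) = W(t) and H(t) = 1, for which (4.2.42) predicts $\langle (\int G\, dW)^2 \rangle = t^2/2$, $\langle \int G\, dW \int H\, dW \rangle = 0$ and $\langle (\int H\, dW)^2 \rangle = t$. The function names are hypothetical.

```python
import math
import random

def ito_integral_pair(n_steps, t_final, rng):
    """One sample of (int G dW, int H dW) with G = W and H = 1, using left-point sums."""
    dt = t_final / n_steps
    sd = math.sqrt(dt)
    w = 0.0
    int_g = 0.0
    int_h = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sd)
        int_g += w * dw     # G_{i-1} = W(t_{i-1})
        int_h += dw         # H_{i-1} = 1
        w += dw
    return int_g, int_h

def correlation_estimates(n_steps=50, t_final=1.0, n_paths=10000, seed=0):
    """Sample averages of (int G dW)^2, (int G dW)(int H dW) and (int H dW)^2."""
    rng = random.Random(seed)
    gg = gh = hh = 0.0
    for _ in range(n_paths):
        ig, ih = ito_integral_pair(n_steps, t_final, rng)
        gg += ig * ig
        gh += ig * ih
        hh += ih * ih
    return gg / n_paths, gh / n_paths, hh / n_paths
```

For t = 1 the three averages should be close to 0.5, 0 and 1. With a finite mesh, the first is biased slightly low, since the discrete sum has mean square $\sum_i t_{i-1}\, \Delta t_i$ rather than $t^2/2$.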
Formally, this is equivalent to the idea that Langevin terms $\xi(t)$ are delta correlated
and uncorrelated with G(t) and H(t). For, rewriting

$$dW(t) \to \xi(t)\, dt , \tag{4.2.45}$$

it is clear that if G(t) and H(t) are nonanticipating, $\xi(t)$ is independent of them, and
we get

$$\int_{t_0}^{t} dt' \int_{t_0}^{t} ds'\, \langle G(t') H(s')\, \xi(t')\, \xi(s') \rangle
= \int_{t_0}^{t}\!\! \int_{t_0}^{t} dt'\, ds'\, \langle G(t') H(s') \rangle \langle \xi(t')\, \xi(s') \rangle
= \int_{t_0}^{t} dt'\, \langle G(t') H(t') \rangle \tag{4.2.46}$$

which implies

$$\langle \xi(t)\, \xi(s) \rangle = \delta(t - s) .$$


An important point of definition arises here, however. In integrals involving delta 
functions, it frequently occurs in the study of stochastic differential equations 
that the argument of the delta function is equal to either the upper or the lower 
limit of the integral, that is, we find integrals like 


$$I_1 = \int_{t_1}^{t_2} dt\, f(t)\, \delta(t - t_1) \tag{4.2.47}$$

or

$$I_2 = \int_{t_1}^{t_2} dt\, f(t)\, \delta(t - t_2) \tag{4.2.48}$$

and various conventions can be made concerning the value of such integrals. We
will show that in the present context, we must always make the interpretation

$$I_1 = f(t_1) \tag{4.2.49}$$

$$I_2 = 0 \tag{4.2.50}$$

corresponding to counting all the weight of a delta function at the lower limit of an
integral, and none of the weight at the upper limit. To demonstrate this, note that

$$\Big\langle \int_{t_0}^{t} G(t')\, dW(t') \Big[ \int_{t_0}^{t'} H(s')\, dW(s') \Big] \Big\rangle = 0 . \tag{4.2.51}$$


This follows, since the function defined by the integral inside the square bracket is, 
by Sect.4.2.4 comment (v), a nonanticipating function and hence the complete 
integrand, [obtained by multiplying by G(t’) which is also nonanticipating] is 
itself nonanticipating. Hence the average vanishes by the result of Sect. 4.2.6e. 


Now using the formulation in terms of the Langevin source $\xi(t)$, we can rewrite
(4.2.51) as

$$\int_{t_0}^{t} dt' \int_{t_0}^{t'} ds'\, \langle G(t') H(s') \rangle\, \delta(t' - s') = 0 \tag{4.2.52}$$

which corresponds to not counting the weight of the delta function at the upper
limit. Consequently, the full weight must be counted at the lower limit.

This property is a direct consequence of the definition of the Ito integral as in
(4.2.10), in which the increment points "towards the future". That is, we can
interpret

$$dW(t) = W(t + dt) - W(t) . \tag{4.2.53}$$

In the case of the Stratonovich integral, we get quite a different formula, which is
by no means as simple to prove as in the Ito case, but which amounts to choosing

$$I_1 = \tfrac{1}{2} f(t_1) , \qquad I_2 = \tfrac{1}{2} f(t_2) \qquad \text{(Stratonovich)} . \tag{4.2.54}$$

This means that in both cases, the delta function occurring at the limit of an integral
has half its weight counted. This formula, although intuitively more satisfying than
the Ito form, is more complicated to use, especially in the perturbation theory of
stochastic differential equations, where the Ito method makes very many terms
vanish.


4.3 Stochastic Differential Equations (SDE) 


We concluded in Sect.4.1, that the most satisfactory interpretation of the Langevin 
equation 


$$\frac{dx}{dt} = a(x, t) + b(x, t)\, \xi(t) \tag{4.3.1}$$

is a stochastic integral equation

$$x(t) - x(0) = \int_0^t dt'\, a[x(t'), t'] + \int_0^t dW(t')\, b[x(t'), t'] . \tag{4.3.2}$$

Unfortunately, the kind of stochastic integral to be used is not given by the reason- 
ing of Sect.4.1. The Ito integral is mathematically and technically the most satis- 
factory, but unfortunately, it is not always the most natural choice physically. 
The Stratonovich integral is the natural choice for an interpretation which assumes 
$\xi(t)$ is a real noise (not a white noise) with finite correlation time, which is then
allowed to become infinitesimally small after calculating measurable quantities. 
Furthermore, a Stratonovich interpretation enables us to use ordinary calculus, 
which is not possible for an Ito interpretation. 

From a mathematical point of view, the choice is made clear by the near impos- 
sibility of carrying out proofs using the Stratonovich integral. We will therefore 
define the Ito SDE, develop its equivalence with the Stratonovich SDE, and use 
either form depending on circumstances. The relationship between white noise 
stochastic differential equations and the real noise systems is explained in Sect.6.5.


4.3.1 Ito Stochastic Differential Equation: Definition


A stochastic quantity x(t) obeys an Ito SDE written as

$$dx(t) = a[x(t), t]\, dt + b[x(t), t]\, dW(t) \tag{4.3.3}$$

if for all t and $t_0$,

$$x(t) = x(t_0) + \int_{t_0}^{t} dt'\, a[x(t'), t'] + \int_{t_0}^{t} dW(t')\, b[x(t'), t'] . \tag{4.3.4}$$


Before considering what conditions must be satisfied by the coefficients in (4.3.4), 
it is wise to consider what one means by a solution of such an equation and what 
uniqueness of solution would mean in this context. For this purpose, we can consider
a discretised version of the SDE obtained by taking a mesh of points $t_i$ (as
illustrated in Fig. 4.2) such that

$$t_0 < t_1 < t_2 < \dots < t_{n-1} < t_n = t \tag{4.3.5}$$

and writing the equation as

$$x_{i+1} = x_i + a(x_i, t_i)\, \Delta t_i + b(x_i, t_i)\, \Delta W_i . \tag{4.3.6}$$

Fig. 4.2. Illustration of the Cauchy-Euler procedure for constructing an approximate solution of
the stochastic differential equation dx(t) = a[x(t), t]dt + b[x(t), t]dW(t)

Here,

$$\begin{aligned}
x_i &= x(t_i) \\
\Delta t_i &= t_{i+1} - t_i \\
\Delta W_i &= W(t_{i+1}) - W(t_i) .
\end{aligned} \tag{4.3.7}$$

We see from (4.3.6) that an approximate procedure for solving the equation is to
calculate $x_{i+1}$ from the knowledge of $x_i$ by adding a deterministic term

$$a(x_i, t_i)\, \Delta t_i \tag{4.3.8}$$

and a stochastic term

$$b(x_i, t_i)\, \Delta W_i . \tag{4.3.9}$$


The stochastic term contains an element $\Delta W_i$, which is the increment of the Wiener
process, but is statistically independent of $x_i$ if (i) $x_0$ is itself independent of all
$W(t) - W(t_0)$ for $t > t_0$ (thus, the initial conditions, if considered random, must
be nonanticipating) and (ii) a(x, t) is a nonanticipating function of t for any fixed x.

Constructing an approximate solution iteratively by use of (4.3.6), we see that $x_i$
is always independent of $\Delta W_j$ for $j \geq i$.

The solution is then formally constructed by letting the mesh size go to zero. 
To say that the solution is unique means that for a given sample function W(t) of 
the random Wiener process W(t), the particular solution of the equation which 
arises is unique. To say that the solution exists means that with probability one, 
a solution exists for any choice of sample function W(t) of the Wiener process W(t). 

The method of constructing a solution outlined above is called the Cauchy- 
Euler method, and can be used to generate simulations. 
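
As a concrete illustration of how the iteration (4.3.6) can be used to generate simulations, a minimal sketch follows. It assumes a uniform mesh and coefficient functions passed as callables `a(x, t)` and `b(x, t)`; the function name `cauchy_euler` and its argument order are hypothetical.

```python
import math
import random

def cauchy_euler(a, b, x0, t0, t_final, n_steps, rng):
    """Iterate (4.3.6) on a uniform mesh and return the lists (times, values)."""
    dt = (t_final - t0) / n_steps
    sd = math.sqrt(dt)
    t, x = t0, x0
    times, values = [t], [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sd)               # Delta W_i, drawn after x_i is known
        x = x + a(x, t) * dt + b(x, t) * dw   # both coefficients evaluated at (x_i, t_i)
        t = t + dt
        times.append(t)
        values.append(x)
    return times, values
```

For example, `cauchy_euler(lambda x, t: -x, lambda x, t: math.sqrt(0.5), 1.0, 0.0, 1.0, 100, random.Random(0))` produces one approximate sample path of an Ornstein-Uhlenbeck-type equation. Because each $\Delta W_i$ is drawn only after $x_i$ has been computed, the construction respects the independence of $x_i$ and $\Delta W_j$ for $j \geq i$.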

However, this is not the way uniqueness and existence are usually demonstrated, 
though it is possible to demonstrate these properties this way. Existence and unique- 
ness will not be proved here. The interested reader will find proofs in [4.3]. 
The conditions which are required for existence and uniqueness in a time interval 
$[t_0, T]$ are:

i) Lipschitz condition: a K exists such that

$$|a(x, t) - a(y, t)| + |b(x, t) - b(y, t)| \leq K |x - y| \tag{4.3.10}$$

for all x and y, and all t in the range $[t_0, T]$.

ii) growth condition: a K exists such that for all t in the range $[t_0, T]$,

$$|a(x, t)|^2 + |b(x, t)|^2 \leq K^2 (1 + |x|^2) . \tag{4.3.11}$$

Under these conditions there will be a unique nonanticipating solution x(t) in the
range $[t_0, T]$.

Almost every stochastic differential equation encountered in practice satisfies 
the Lipschitz condition since it is essentially a smoothness condition. However, 
the growth condition is often violated. This does not mean that no solution exists; 
rather, it means the solution may “‘explode”’ to infinity, that is, the value of x can 
become infinite in a finite time; in practice, a finite random time. This phenomenon 
occurs in ordinary differential equations, for example, 


$$\frac{dx}{dt} = \tfrac{1}{2}\, a x^3 \tag{4.3.12}$$

has the general solution with an initial condition $x = x_0$ at t = 0,

$$x(t) = (-at + 1/x_0^2)^{-1/2} . \tag{4.3.13}$$

If a is positive, this becomes infinite when $x_0 = (at)^{-1/2}$ but if a is negative, the
solution never explodes. Failing to satisfy the Lipschitz condition does not guarantee
the solution will explode. More precise stability results are required for one to be
certain of that [4.3].
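
The explosion in this deterministic example is easily verified numerically. The sketch below is illustrative only: it assumes $x_0 > 0$ and uses a standard fourth-order Runge-Kutta step, with hypothetical function names. It compares the numerical integration of (4.3.12) with the closed form (4.3.13), and returns the explosion time $t = 1/(a x_0^2)$ obtained by solving $x_0 = (at)^{-1/2}$ for t.

```python
import math

def exact_solution(t, a, x0):
    """Solution (4.3.13) of dx/dt = a*x**3/2 with x(0) = x0 > 0."""
    return (-a * t + 1.0 / x0 ** 2) ** -0.5

def explosion_time(a, x0):
    """Time at which (4.3.13) diverges; None when a <= 0, where no explosion occurs."""
    return 1.0 / (a * x0 ** 2) if a > 0 else None

def rk4(a, x0, t_final, n_steps):
    """Classical fourth-order Runge-Kutta integration of (4.3.12) from t = 0."""
    def f(x):
        return 0.5 * a * x ** 3
    h = t_final / n_steps
    x = x0
    for _ in range(n_steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x
```

With `a = 1` and `x0 = 1` the solution grows without bound as t approaches 1, while with `a = -1` it decays smoothly for all t > 0.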


4.3.2 Markov Property of the Solution of an Ito Stochastic Differential Equation 


We now show that x(t), the solution to the stochastic differential equation (4.3.4), 
is a Markov Process. Heuristically, the result is obvious, since with a given initial 
condition $x(t_0)$, the future time development is uniquely (stochastically) determined,
that is, x(t) for $t > t_0$ is determined only by

i) the particular sample path of W(t) for $t > t_0$;
ii) the value of $x(t_0)$.

Since x(t) is a nonanticipating function of t, W(t) for $t > t_0$ is independent of x(t)
for $t < t_0$. Thus, the time development of x(t) for $t > t_0$ is independent of x(t) for
$t < t_0$ provided $x(t_0)$ is known. Hence, x(t) is a Markov process. For a precise proof
see [4.3].


4.3.3 Change of Variables: Ito’s Formula 


Consider an arbitrary function of x(t): f[x(t)]. What stochastic differential equa- 
tion does it obey? We use the results of Sect.4.2.5 to expand df[x(t)] to second 
order in dW(t): 


$$\begin{aligned}
df[x(t)] &= f[x(t) + dx(t)] - f[x(t)] \\
&= f'[x(t)]\, dx(t) + \tfrac{1}{2} f''[x(t)]\, dx(t)^2 + \dots \\
&= f'[x(t)]\, \{ a[x(t), t]\, dt + b[x(t), t]\, dW(t) \} \\
&\quad + \tfrac{1}{2} f''[x(t)]\, b[x(t), t]^2\, [dW(t)]^2 + \dots ,
\end{aligned}$$

where all other terms have been discarded since they are of higher order. Now use
$[dW(t)]^2 = dt$ to obtain

$$df[x(t)] = \{ a[x(t), t]\, f'[x(t)] + \tfrac{1}{2} b[x(t), t]^2 f''[x(t)] \}\, dt
+ b[x(t), t]\, f'[x(t)]\, dW(t) . \tag{4.3.14}$$

This formula is known as Ito's formula and shows that changing variables is
not given by ordinary calculus unless f[x(t)] is merely linear in x(t).
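
To see the extra term of Ito's formula at work, consider the assumed example $dx = \alpha x\, dt + \beta x\, dW(t)$ with $f(x) = \log x$. Formula (4.3.14) gives $d \log x = (\alpha - \tfrac{1}{2}\beta^2)\, dt + \beta\, dW(t)$, so that $x(t) = x_0 \exp[(\alpha - \tfrac{1}{2}\beta^2)t + \beta W(t)]$, whereas ordinary calculus would omit the $\tfrac{1}{2}\beta^2$. The sketch below, with hypothetical function names, builds the transformed coefficients and compares both closed forms with an iteration of (4.3.6) on the same sample path.

```python
import math
import random

def ito_transform(a, b, df, d2f):
    """Drift and noise coefficients of f[x(t)] according to Ito's formula (4.3.14)."""
    def drift(x, t):
        return a(x, t) * df(x) + 0.5 * b(x, t) ** 2 * d2f(x)
    def noise(x, t):
        return b(x, t) * df(x)
    return drift, noise

def compare_on_one_path(alpha, beta, x0, t_final, n_steps, seed=0):
    """Iterate dx = alpha*x dt + beta*x dW and compare with two closed forms.

    Returns (x_numeric, x_ito, x_ordinary), all driven by the same W(t).
    """
    rng = random.Random(seed)
    dt = t_final / n_steps
    sd = math.sqrt(dt)
    x, w = x0, 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sd)
        x += alpha * x * dt + beta * x * dw
        w += dw
    x_ito = x0 * math.exp((alpha - 0.5 * beta ** 2) * t_final + beta * w)
    x_ordinary = x0 * math.exp(alpha * t_final + beta * w)   # ordinary-calculus guess
    return x, x_ito, x_ordinary
```

On a fine mesh the numerical value should track `x_ito` closely, while `x_ordinary` is too large by the factor $\exp(\tfrac{1}{2}\beta^2 t)$.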


Many Variables. In practice, Ito’s formula becomes very complicated and the 
easiest method is to simply use the multivariate form of the rule that dW(t) is an
infinitesimal of order 1/2. By similar methods to those used in Sect.4.2.5, we can show
that for an n dimensional Wiener process W(t),

$$dW_i(t)\, dW_j(t) = \delta_{ij}\, dt \tag{4.3.15a}$$

$$[dW_i(t)]^{N+2} = 0 \quad (N > 0) \tag{4.3.15b}$$

$$dW_i(t)\, dt = 0 \tag{4.3.15c}$$

$$dt^{1+N} = 0 \quad (N > 0) , \tag{4.3.15d}$$

which imply that $dW_i(t)$ is an infinitesimal of order 1/2. Note, however, that (4.3.15a)
is a consequence of the independence of $dW_i(t)$ and $dW_j(t)$. To develop Ito's formula
for functions of an n dimensional vector x(t) satisfying the stochastic differential
equation

$$dx = A(x, t)\, dt + B(x, t)\, dW(t) , \tag{4.3.16}$$

we simply follow this procedure. The result is

$$df(x) = \Big\{ \sum_i A_i(x, t)\, \partial_i f(x) + \tfrac{1}{2} \sum_{i,j} [B(x, t) B^{\mathrm{T}}(x, t)]_{ij}\, \partial_i \partial_j f(x) \Big\}\, dt
+ \sum_{i,j} B_{ij}(x, t)\, \partial_i f(x)\, dW_j(t) . \tag{4.3.17}$$


4.3.4 Connection Between Fokker-Planck Equation and Stochastic 
Differential Equation 


We now consider the time development of an arbitrary f(x(t)). Using Ito's formula

$$\frac{\langle df[x(t)] \rangle}{dt} = \Big\langle \frac{df[x(t)]}{dt} \Big\rangle = \frac{d}{dt} \langle f[x(t)] \rangle
= \langle a[x(t), t]\, \partial_x f + \tfrac{1}{2} b[x(t), t]^2\, \partial_x^2 f \rangle . \tag{4.3.18}$$

However, x(t) has a conditional probability density $p(x, t \mid x_0, t_0)$ and

$$\frac{d}{dt} \langle f[x(t)] \rangle = \int dx\, f(x)\, \partial_t p(x, t \mid x_0, t_0)
= \int dx\, [a(x, t)\, \partial_x f + \tfrac{1}{2} b(x, t)^2\, \partial_x^2 f]\, p(x, t \mid x_0, t_0) . \tag{4.3.19}$$

This is now of the same form as (3.4.16) Sect.3.4.1. Under the same conditions as
there, we integrate by parts and discard surface terms to obtain

$$\int dx\, f(x)\, \partial_t p = \int dx\, f(x)\, \{ -\partial_x [a(x, t)\, p] + \tfrac{1}{2} \partial_x^2 [b(x, t)^2\, p] \}$$

and hence, since f(x) is arbitrary,

$$\partial_t p(x, t \mid x_0, t_0) = -\partial_x [a(x, t)\, p(x, t \mid x_0, t_0)] + \tfrac{1}{2} \partial_x^2 [b(x, t)^2\, p(x, t \mid x_0, t_0)] . \tag{4.3.20}$$

We have thus a complete equivalence to a diffusion process defined by a drift
coefficient a(x, t) and a diffusion coefficient $b(x, t)^2$.

The results are precisely analogous to those of Sect.3.5.3, in which it was shown 
that the diffusion process could be locally approximated by an equation resembling 
an Ito stochastic differential equation. 
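
As a worked illustration of this correspondence, take the Ornstein-Uhlenbeck process of Sect. 3.8.4. We assume here that it corresponds to the choices $a(x, t) = -kx$ and $b(x, t) = \sqrt{D}$, which is consistent with the stationary results (3.8.81) and (3.8.82). Then (4.3.20) and (4.3.18) give:

```latex
% Assumed example: Ornstein-Uhlenbeck process with a(x,t) = -k x, b(x,t) = \sqrt{D}.
\begin{align*}
dx &= -k x\,dt + \sqrt{D}\,dW(t), \\
\partial_t p &= \partial_x (k x\, p) + \tfrac{1}{2} D\, \partial_x^2 p
  && \text{from (4.3.20)}, \\
\frac{d}{dt}\langle x \rangle &= -k \langle x \rangle
  && \text{from (4.3.18) with } f = x, \\
\frac{d}{dt}\langle x^2 \rangle &= -2k \langle x^2 \rangle + D
  && \text{from (4.3.18) with } f = x^2, \\
\langle x^2 \rangle_s &= \frac{D}{2k}
  && \text{setting the time derivative to zero.}
\end{align*}
```

The stationary mean is therefore zero and the stationary variance is $D/2k$, in agreement with (3.8.82) evaluated at t = s.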


4.3.5 Multivariable Systems 


In general, many variable systems of stochastic differential equations can be defined 
for n variables by 


dx = A(x, t)dt + B(x, t\dW(t), (4.3.21) 


where dW(t) is an n variable Wiener process, as defined in Sect.3.8.1. The many 
variable version of the reasoning used in Sect. 4.3.4 shows that the Fokker-Planck 
equation for the conditional probability density p(x, t| xo, to) = p is 


1p = —2, OfA(x, t)p] + 4 2 9:0; {1B t)B"(x, t)].;P}. (4.3.22) 


Notice that the same Fokker-Planck equation arises from all matrices $B$ such that
$BB^{\mathrm T}$ is the same. This means that we can obtain the same Fokker-Planck equation
by replacing $B$ by $BS$ where $S$ is orthogonal, i.e., $SS^{\mathrm T} = 1$. Notice that $S$ may
depend on $x(t)$. This can be seen more directly. Suppose $S(t)$ is an orthogonal matrix
with an arbitrary nonanticipating dependence on $t$. Then define

$$dV(t) = S(t)\,dW(t) .$$ (4.3.23)


Now the vector $dV(t)$ is a linear combination of Gaussian variables $dW(t)$ with
coefficients $S(t)$ which are independent of $dW(t)$, since $S(t)$ is nonanticipating. For any
fixed value of $S(t)$, the $dV(t)$ are thus Gaussian and their correlation matrix is

$$\langle dV_i(t)\,dV_j(t)\rangle = \sum_{l,m} S_{il}(t)S_{jm}(t)\,\langle dW_l(t)\,dW_m(t)\rangle = \sum_l S_{il}(t)S_{jl}(t)\,dt = \delta_{ij}\,dt$$ (4.3.24)

since $S(t)$ is orthogonal. Hence, all the moments are independent of $S(t)$ and are
the same as those of $dW(t)$, so $dV(t)$ is itself Gaussian with the same correlation
matrix as $dW(t)$. Finally, averages at different times factorise, for example, if
$t > t'$ in

$$\sum_{i,k}\langle [dW_i(t)S_{ij}(t)]^m\,[dW_k(t')S_{kl}(t')]^n\rangle ,$$ (4.3.25)

we can factorise out the averages of $dW_i(t)$ to various powers since $dW_i(t)$ is
independent of all other terms. Evaluating these we will find that the orthogonal
nature of $S(t)$ gives, after averaging over $dW_i(t)$, simply

$$\sum_k \langle [dW_j(t)]^m\rangle\,\langle [dW_k(t')S_{kl}(t')]^n\rangle$$ (4.3.26)

which similarly gives $\langle [dW_j(t)]^m [dW_l(t')]^n\rangle$. Hence, the $dV(t)$ are also increments of
a Wiener process. The orthogonal transformation simply mixes up different
sample paths of the process, without changing its stochastic nature.


Hence, instead of (4.3.21) we can write

$$dx = A(x, t)\,dt + B(x, t)\,S^{\mathrm T}(t)S(t)\,dW(t)$$ (4.3.27)
$$\phantom{dx} = A(x, t)\,dt + B(x, t)\,S^{\mathrm T}(t)\,dV(t),$$ (4.3.28)

and since $V(t)$ is itself simply a Wiener process, this equation is equivalent to

$$dx = A(x, t)\,dt + B(x, t)\,S^{\mathrm T}(t)\,dW(t)$$ (4.3.29)

which has exactly the same Fokker-Planck equation (4.3.22).
We will return to some examples in which this identity is relevant in Sect.4.4.6.
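
As an illustration only (it is not part of the original text), the statement that $dV(t) = S(t)\,dW(t)$ has the same second moments as $dW(t)$ can be checked by sampling. The sketch below assumes two dimensions and a rotation angle that depends on the accumulated past of $V$, which is one hypothetical way of making $S(t)$ nonanticipating; the function name `rotated_increments` is likewise an assumption.

```python
import math
import random


def rotated_increments(n_steps, dt, seed=0):
    """Generate increments dV = S(t) dW, with S(t) a rotation fixed by the past only."""
    rng = random.Random(seed)
    v1 = v2 = 0.0
    increments = []
    for _ in range(n_steps):
        theta = v1 + v2  # depends only on earlier increments: nonanticipating
        dw1 = rng.gauss(0.0, math.sqrt(dt))
        dw2 = rng.gauss(0.0, math.sqrt(dt))
        c, s = math.cos(theta), math.sin(theta)
        dv1 = c * dw1 + s * dw2    # first row of the orthogonal matrix S
        dv2 = -s * dw1 + c * dw2   # second row
        increments.append((dv1, dv2))
        v1 += dv1
        v2 += dv2
    return increments


def second_moments(increments):
    """Return sample estimates of <dV1^2>, <dV2^2> and <dV1 dV2>."""
    n = len(increments)
    m11 = sum(a * a for a, _ in increments) / n
    m22 = sum(b * b for _, b in increments) / n
    m12 = sum(a * b for a, b in increments) / n
    return m11, m22, m12
```

According to (4.3.24), the three estimates should approach $dt$, $dt$ and $0$ respectively.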


4.3.6 Stratonovich’s Stochastic Differential Equation 


Stratonovich [4.2] has defined a stochastic integral of an integrand which is a
function of $x(t)$ and $t$ by

$$S\!\int_{t_0}^{t} G[x(t'), t']\,dW(t') = \mathop{\text{ms-lim}}_{n\to\infty}\sum_{i=1}^{n} G\Big[\frac{x(t_i) + x(t_{i-1})}{2},\, t_{i-1}\Big][W(t_i) - W(t_{i-1})] .$$ (4.3.30)

It should be noted that only the dependence on $x(t)$ is averaged. If $G(z, t)$ is
differentiable in $t$, the integral is independent of the particular choice of value for $t$
in the range $[t_{i-1}, t_i]$.
It is possible to write a stochastic differential equation (SDE) using Stratonovich’s
integral

$$x(t) = x(t_0) + \int_{t_0}^{t} dt'\,\alpha[x(t'), t'] + S\!\int_{t_0}^{t} dW(t')\,\beta[x(t'), t'] ,$$ (4.3.31)

and we shall show that this is equivalent to an appropriate Ito SDE.
Let us assume that $x(t)$ is a solution of

$$dx(t) = a[x(t), t]\,dt + b[x(t), t]\,dW(t)$$ (4.3.32)

and deduce the corresponding $\alpha$ and $\beta$. In both cases, the solution $x(t)$ is the same
function. We first compute the connection between $S\!\int_{t_0}^{t} dW(t')\,\beta[x(t'), t']$ and
$\int_{t_0}^{t} dW(t')\,b[x(t'), t']$. Then,

$$S\!\int_{t_0}^{t} dW(t')\,\beta[x(t'), t'] \simeq \sum_i \beta\Big[\frac{x(t_i) + x(t_{i-1})}{2},\, t_{i-1}\Big][W(t_i) - W(t_{i-1})] .$$ (4.3.33)

In (4.3.33) we write

$$x(t_i) = x(t_{i-1}) + dx(t_{i-1})$$

and use the Ito SDE (4.3.32) to write

$$dx(t_{i-1}) = a[x(t_{i-1}), t_{i-1}](t_i - t_{i-1}) + b[x(t_{i-1}), t_{i-1}][W(t_i) - W(t_{i-1})] .$$ (4.3.34)

Then, applying Ito’s formula, we can write

$$\beta\Big[\frac{x(t_i) + x(t_{i-1})}{2},\, t_{i-1}\Big] = \beta[x(t_{i-1}) + \tfrac{1}{2}dx(t_{i-1}),\, t_{i-1}]$$
$$= \beta(t_{i-1}) + [a(t_{i-1})\,\partial_x\beta(t_{i-1}) + \tfrac{1}{4}b^2(t_{i-1})\,\partial_x^2\beta(t_{i-1})]\,[\tfrac{1}{2}(t_i - t_{i-1})] + \tfrac{1}{2}b(t_{i-1})\,\partial_x\beta(t_{i-1})\,[W(t_i) - W(t_{i-1})] .$$ (4.3.35)

(For simplicity, we write $\beta(t_i)$ etc., instead of $\beta[x(t_i), t_i]$ wherever possible.) Putting
all these back in the original equation (4.3.33) and dropping as usual $dt^2$, $dt\,dW$, and
setting $dW^2 = dt$, we find

$$S\!\int = \sum_i \beta(t_{i-1})\{W(t_i) - W(t_{i-1})\} + \tfrac{1}{2}\sum_i b(t_{i-1})\,\partial_x\beta(t_{i-1})\,(t_i - t_{i-1}) .$$ (4.3.36)

Hence we derive

$$S\!\int_{t_0}^{t}\beta[x(t'), t']\,dW(t') = \int_{t_0}^{t}\beta[x(t'), t']\,dW(t') + \tfrac{1}{2}\int_{t_0}^{t} b[x(t'), t']\,\partial_x\beta[x(t'), t']\,dt' .$$ (4.3.37)

This formula gives a connection between the Ito and Stratonovich integrals of
functions $\beta[x(t'), t']$, in which $x(t')$ is the solution of the Ito SDE (4.3.31). It does
not give a general connection between the Ito and Stratonovich integrals of
arbitrary functions.
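
The connection (4.3.37) can be made concrete with the simplest nontrivial case. The sketch below is an illustration, not part of the original text: it assumes $a = 0$, $b = 1$ (so that $x = W$ with $W(0) = 0$) and $\beta(x) = x$, for which (4.3.37) predicts that the Stratonovich sum exceeds the Ito sum by $t/2$. The function names are hypothetical.

```python
import math
import random


def wiener_path(n_steps, t_end, seed=0):
    """Sample W at n_steps + 1 equally spaced times on [0, t_end], with W(0) = 0."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    w = [0.0]
    for _ in range(n_steps):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w


def ito_and_stratonovich_sums(w):
    """Discrete sums approximating the Ito and Stratonovich integrals of W dW."""
    ito = 0.0
    strat = 0.0
    for i in range(1, len(w)):
        dw = w[i] - w[i - 1]
        ito += w[i - 1] * dw                   # integrand at the start of the step
        strat += 0.5 * (w[i] + w[i - 1]) * dw  # integrand at the averaged x, as in (4.3.30)
    return ito, strat
```

The Stratonovich sum telescopes to $W(t)^2/2$, the ordinary-calculus result, while the difference between the two sums is $\tfrac12\sum_i [W(t_i) - W(t_{i-1})]^2$, which tends to $t/2$.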

If we now make the choice

$$\alpha(x, t) = a(x, t) - \tfrac{1}{2}b(x, t)\,\partial_x b(x, t)$$
$$\beta(x, t) = b(x, t)$$ (4.3.38)

we see that

the Ito SDE $dx = a\,dt + b\,dW(t)$, (4.3.39a)

is the same as the Stratonovich SDE $dx = [a - \tfrac{1}{2}b\,\partial_x b]\,dt + b\,dW(t)$, (4.3.39b)

or conversely,

the Stratonovich SDE $dx = \alpha\,dt + \beta\,dW(t)$ (4.3.40a)

is the same as the Ito SDE $dx = [\alpha + \tfrac{1}{2}\beta\,\partial_x\beta]\,dt + \beta\,dW(t)$. (4.3.40b)

Comments


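i) Using Ito’s formula (4.3.14) we can show that the rule for a change of variables
in Stratonovich SDE is exactly the same as in ordinary calculus. Start with the
Stratonovich SDE (4.3.40a), and convert to the Ito SDE (4.3.40b). Change to the
new variable $y = f(x)$ with the inverse $x = g(y)$.

Define $\bar\alpha(y) = \alpha[g(y)]$,
$\bar\beta(y) = \beta[g(y)]$.

Use Ito’s formula and note that $df/dx = (dg/dy)^{-1}$ to obtain the Ito SDE

$$dy = \Big[\bar\alpha\Big(\frac{dg}{dy}\Big)^{-1} + \tfrac{1}{2}\,\partial_y\bar\beta\,\Big(\frac{dg}{dy}\Big)^{-2}\bar\beta - \tfrac{1}{2}\,\frac{d^2g}{dy^2}\Big(\frac{dg}{dy}\Big)^{-3}\bar\beta^2\Big]dt + \Big(\frac{dg}{dy}\Big)^{-1}\bar\beta\,dW .$$

Now convert back to a Stratonovich equation using (4.3.39); we obtain

$$dy = (\bar\alpha\,dt + \bar\beta\,dW)\Big(\frac{dg}{dy}\Big)^{-1}$$

or

$$df[x(t)] = \{\alpha[x(t), t]\,dt + \beta[x(t), t]\,dW(t)\}\,f'[x(t)]$$ (4.3.41)

which is the same as in ordinary calculus.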


ii) Many Variables. If a many variable Ito equation is

$$dx = A(x, t)\,dt + B(x, t)\,dW(t) ,$$ (4.3.42)

then the corresponding Stratonovich equation can be shown similarly to be given
by replacing

$$A_i^{s} = A_i - \tfrac{1}{2}\sum_{j,k} B_{kj}\,\partial_k B_{ij}$$
$$B_{ij}^{s} = B_{ij} .$$ (4.3.43)

iii) Fokker-Planck Equation corresponding to the Stratonovich SDE,

$$(S)\quad dx = A^{s}(x, t)\,dt + B^{s}(x, t)\,dW(t)$$ (4.3.44)

can, by use of (4.3.43) and the known correspondence (Sect.4.3.5) between the
Ito SDE and Fokker-Planck equation, be put in the form

$$\partial_t p = -\sum_i \partial_i\{A_i^{s}\,p\} + \tfrac{1}{2}\sum_{i,j,k}\partial_i\{B_{ik}^{s}\,\partial_j[B_{jk}^{s}\,p]\}$$ (4.3.45)

which is often known as the “Stratonovich form” of the Fokker-Planck equation.
In contrast to the two forms of the SDEs, the two forms of Fokker-Planck equation
have a different appearance but are (of course) interpreted with the same rules:
those of ordinary calculus. We will find later that the Stratonovich form of the
Fokker-Planck equation does arise very naturally in certain contexts (Sect.6.6).


iv) Comparison of the Ito and Stratonovich Integrals. The Stratonovich integral as
defined in (4.3.30) is quite a specialised concept, for it can only be defined in terms
of a function $G(z, t)$ of two variables. The more “obvious” definition in terms of
$G\big(x[\tfrac{1}{2}(t_i + t_{i-1})],\, \tfrac{1}{2}(t_i + t_{i-1})\big)$ was not used by Stratonovich in his original
definition, although the view that this provides the definition of the Stratonovich
integral is widespread in the literature (including the first edition of this book).
Apparently, the more obvious definition cannot be proved to converge; see [4.6].
In practice, the precise definition of the Stratonovich integral from first principles
is of no great interest, whereas the property that the rule for change of variables is
given by ordinary calculus is of great significance, and this is ensured not so much
by the definition as by the relations (4.3.37, 43) between the two kinds of integral.
One could simply choose to define the Stratonovich integral as being given by
(4.3.37) when the function obeys the SDE (4.3.31), and this would be mathematically
completely satisfactory, and much less confusing.


4.3.7 Dependence on Initial Conditions and Parameters 


In exactly the same way as in the case of deterministic differential equations, if 
the functions which occur in a stochastic differential equation depend continuously 
on parameters, then the solution normally depends continuously on that parameter. 
Similarly, the solution depends continuously on the initial conditions. Let us formu- 
late this more precisely. Consider a one-variable equation 


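$$dx = a(\lambda, x, t)\,dt + b(\lambda, x, t)\,dW(t)$$

with initial condition (4.3.46)

$$x(t_0) = c(\lambda)$$

where $\lambda$ is a parameter. Let the solution of (4.3.46) be $x(\lambda, t)$. Suppose

i) $\mathop{\text{st-lim}}_{\lambda\to\lambda_0} c(\lambda) = c(\lambda_0)$;

ii) $\lim_{\lambda\to\lambda_0}\Big\{\sup_{t\in[t_0, T],\,|x|<N}\big[\,|a(\lambda, x, t) - a(\lambda_0, x, t)| + |b(\lambda, x, t) - b(\lambda_0, x, t)|\,\big]\Big\} = 0$;

iii) a $K$ independent of $\lambda$ exists such that

$$|a(\lambda, x, t)|^2 + |b(\lambda, x, t)|^2 \le K^2(1 + |x|^2) .$$

Then

$$\mathop{\text{st-lim}}_{\lambda\to\lambda_0}\Big\{\sup_{t\in[t_0, T]}|x(\lambda, t) - x(\lambda_0, t)|\Big\} = 0 .$$

For a proof see [4.1].

Comments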


i) Recalling the definition of stochastic limit, the interpretation of the limit (4.3.50)
is that as $\lambda \to \lambda_0$, the probability that the maximum deviation over any finite
interval $[t_0, T]$ between $x(\lambda, t)$ and $x(\lambda_0, t)$ is nonzero, goes to zero.


ii) Dependence on the initial condition is achieved by letting a and b be independent 
of A. 


iii) The result will be very useful in justifying perturbation expansions. 


iv) Condition (ii) is written in the most natural form for the case that the functions
$a(x, t)$ and $b(x, t)$ are not themselves stochastic. It often arises that $a(x, t)$ and
$b(x, t)$ are themselves stochastic (nonanticipating) functions. In this case, condition
(ii) must be replaced by a probabilistic statement. It is, in fact, sufficient to replace
$\lim_{\lambda\to\lambda_0}$ by $\mathop{\text{st-lim}}_{\lambda\to\lambda_0}$.


4.4 Some Examples and Solutions 


4.4.1 Coefficients Without x Dependence 
The simple equation

$$dx = a(t)\,dt + b(t)\,dW(t)$$ (4.4.1)

with $a(t)$ and $b(t)$ nonrandom functions of time, is solved simply by integrating

$$x(t) = x_0 + \int_{t_0}^{t} a(t')\,dt' + \int_{t_0}^{t} b(t')\,dW(t') .$$ (4.4.2)

Here, $x_0$ can be either a nonrandom initial condition or may be random, but must
be independent of $W(t) - W(t_0)$ for $t > t_0$; otherwise, $x(t)$ is not nonanticipating.
As constructed, $x(t)$ is Gaussian, provided $x_0$ is either nonrandom or itself
Gaussian, since

$$\int_{t_0}^{t} b(t')\,dW(t')$$

is simply a linear combination of infinitesimal Gaussian variables. Further,

$$\langle x(t)\rangle = \langle x_0\rangle + \int_{t_0}^{t} a(t')\,dt'$$

(since the mean of the Ito integral vanishes)
and

$$\langle [x(t) - \langle x(t)\rangle][x(s) - \langle x(s)\rangle]\rangle \equiv \langle x(t), x(s)\rangle = \Big\langle \int_{t_0}^{t} b(t')\,dW(t')\int_{t_0}^{s} b(s')\,dW(s')\Big\rangle = \int_{t_0}^{\min(t, s)} [b(t')]^2\,dt' ,$$

where we have used the result (4.2.42) with, however,

$$G(t') = b(t') \quad (t' < t), \qquad G(t') = 0 \quad (t' \ge t)$$
$$H(t') = b(t') \quad (t' < s), \qquad H(t') = 0 \quad (t' \ge s) .$$

The process is thus completely determined.


4.4.2 Multiplicative Linear White Noise Process

The equation

$$dx = c\,x\,dW(t)$$ (4.4.3)

is known as a multiplicative white noise process because it is linear in $x$, but the
“noise term” $dW(t)$ multiplies $x$. We can solve this exactly by using Ito’s formula.
Let us define a new variable by

$$y = \log x ,$$ (4.4.4)

so that

$$dy = \frac{dx}{x} - \frac{1}{2}\frac{(dx)^2}{x^2} = c\,dW(t) - \tfrac{1}{2}c^2\,dt .$$ (4.4.5)

This equation can now be directly integrated, so we obtain

$$y(t) = y(t_0) + c[W(t) - W(t_0)] - \tfrac{1}{2}c^2(t - t_0)$$ (4.4.6)

and hence,

$$x(t) = x(t_0)\exp\{c[W(t) - W(t_0)] - \tfrac{1}{2}c^2(t - t_0)\} .$$ (4.4.7)

We can calculate the mean by using the formula for any Gaussian variable $z$
with zero mean

$$\langle \exp z\rangle = \exp(\langle z^2\rangle/2)$$ (4.4.8)

so that

$$\langle x(t)\rangle = \langle x(t_0)\rangle\exp[\tfrac{1}{2}c^2(t - t_0) - \tfrac{1}{2}c^2(t - t_0)] = \langle x(t_0)\rangle .$$ (4.4.9)

This result is also obvious from definition, since
$\langle dx\rangle = \langle c\,x\,dW(t)\rangle = 0$ so that

$$\frac{d\langle x\rangle}{dt} = 0 .$$

We can also calculate the autocorrelation function

$$\langle x(t)x(s)\rangle = \langle x(t_0)^2\rangle\,\langle \exp\{c[W(t) + W(s) - 2W(t_0)] - \tfrac{1}{2}c^2(t + s - 2t_0)\}\rangle$$
$$= \langle x(t_0)^2\rangle\exp\{\tfrac{1}{2}c^2[\langle [W(t) + W(s) - 2W(t_0)]^2\rangle - (t + s - 2t_0)]\}$$
$$= \langle x(t_0)^2\rangle\exp\{\tfrac{1}{2}c^2[t + s - 2t_0 + 2\min(t - t_0, s - t_0) - (t + s - 2t_0)]\}$$
$$= \langle x(t_0)^2\rangle\exp[c^2\min(t - t_0, s - t_0)] .$$ (4.4.10)
Stratonovich Interpretation. The solution of this equation interpreted as a
Stratonovich equation can also be obtained, but ordinary calculus would then be valid.
Thus, instead of (4.4.5) we would obtain

$$(S)\quad dy = c\,dW(t)$$

and hence,

$$x(t) = x(t_0)\exp\{c[W(t) - W(t_0)]\} .$$ (4.4.11)

In this case,

$$\langle x(t)\rangle = \langle x(t_0)\rangle\exp[\tfrac{1}{2}c^2(t - t_0)]$$ (4.4.12)

and

$$\langle x(t)x(s)\rangle = \langle x(t_0)^2\rangle\exp\{\tfrac{1}{2}c^2[t + s - 2t_0 + 2\min(t - t_0, s - t_0)]\} .$$ (4.4.13)


One sees that there is a clear difference between these two answers. 
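
The difference between the two interpretations is easy to see by sampling the explicit solutions. The sketch below is illustrative only and not part of the original text: it assumes $x(t_0) = 1$ and $t_0 = 0$, draws $W(t)$ directly as a Gaussian variable of variance $t$, and averages the solutions (4.4.7) and (4.4.11). The name `sample_means` is hypothetical.

```python
import math
import random


def sample_means(c, t, n_samples, seed=0):
    """Monte Carlo means of the Ito solution (4.4.7) and the Stratonovich solution (4.4.11)."""
    rng = random.Random(seed)
    ito_sum = 0.0
    strat_sum = 0.0
    for _ in range(n_samples):
        w = rng.gauss(0.0, math.sqrt(t))               # W(t) - W(0), variance t
        ito_sum += math.exp(c * w - 0.5 * c * c * t)   # Ito solution with x(0) = 1
        strat_sum += math.exp(c * w)                   # Stratonovich solution with x(0) = 1
    return ito_sum / n_samples, strat_sum / n_samples
```

With enough samples the first mean stays near 1, as in (4.4.9), while the second approaches $\exp(\tfrac12 c^2 t)$, as in (4.4.12).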


4.4.3 Complex Oscillator with Noisy Frequency 


This is a simplification of a model due to Kubo [4.4] and is a slight generalisation of
the previous example for complex variables. We consider

$$\frac{dz}{dt} = i[\omega + \sqrt{2\gamma}\,\xi(t)]\,z$$ (4.4.14)

which formally represents a simple model of an oscillator with a mean frequency
$\omega$ perturbed by a noise term $\xi(t)$.
Physically, this is best modelled by writing a Stratonovich equation

$$(S)\quad dz = i[\omega\,dt + \sqrt{2\gamma}\,dW(t)]\,z$$ (4.4.15)

which is equivalent to the Ito equation (from Sect. 4.3.6)

$$dz = [(i\omega - \gamma)\,dt + i\sqrt{2\gamma}\,dW(t)]\,z .$$ (4.4.16)

Taking the mean value, we see immediately that

$$\frac{d\langle z\rangle}{dt} = (i\omega - \gamma)\langle z\rangle$$ (4.4.17)

with the damped oscillatory solution

$$\langle z(t)\rangle = \exp[(i\omega - \gamma)t]\,\langle z(0)\rangle .$$ (4.4.18)


We shall show fully in Sect. 6.6, why the Stratonovich model is more appropriate.
The most obvious way to see this is to note that $\xi(t)$ would, in practice, be somewhat
smoother than a white noise and ordinary calculus would apply, as is the case in the
Stratonovich interpretation.

Now in this case, the correlation function obtained from solving the original
Stratonovich equation is

$$\langle z(t)z(s)\rangle = \langle z(0)^2\rangle\exp[(i\omega - \gamma)(t + s) - 2\gamma\min(t, s)] .$$ (4.4.19)

In the limit $t, s \to \infty$, with $t + \tau = s$,

$$\lim_{t\to\infty}\langle z(t + \tau)z(t)\rangle = 0 .$$ (4.4.20)

However, the correlation function of physical interest is the complex correlation

$$\langle z(t)z^*(s)\rangle = \langle |z(0)|^2\rangle\,\langle \exp\{i\omega(t - s) + i\sqrt{2\gamma}\,[W(t) - W(s)]\}\rangle$$
$$= \langle |z(0)|^2\rangle\exp\{i\omega(t - s) - \gamma[t + s - 2\min(t, s)]\}$$
$$= \langle |z(0)|^2\rangle\exp[i\omega(t - s) - \gamma|t - s|] .$$ (4.4.21)


Thus, the complex correlation function has a damping term which arises purely 
from the noise. It may be thought of as a noise induced dephasing effect, whereby 
for large time differences, z(t) and z*(s) become independent. 
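
The dephasing result (4.4.21) can be checked by direct sampling. The following sketch is an illustration, not part of the original text: it assumes $z(0) = 1$ and uses the ordinary-calculus solution of the Stratonovich equation (4.4.15), $z(t) = \exp[i\omega t + i\sqrt{2\gamma}\,W(t)]$. The function names are hypothetical.

```python
import cmath
import math
import random


def complex_correlation(omega, gamma, t, s, n_samples, seed=0):
    """Monte Carlo estimate of <z(t) z*(s)> for z(0) = 1."""
    rng = random.Random(seed)
    t_lo, t_hi = min(t, s), max(t, s)
    amp = math.sqrt(2.0 * gamma)
    total = 0j
    for _ in range(n_samples):
        # Build W at the earlier time, then add an independent increment.
        w_lo = rng.gauss(0.0, math.sqrt(t_lo))
        w_hi = w_lo + rng.gauss(0.0, math.sqrt(t_hi - t_lo))
        w_t, w_s = (w_lo, w_hi) if t <= s else (w_hi, w_lo)
        z_t = cmath.exp(1j * (omega * t + amp * w_t))
        z_s = cmath.exp(1j * (omega * s + amp * w_s))
        total += z_t * z_s.conjugate()
    return total / n_samples


def predicted_correlation(omega, gamma, t, s):
    """Right-hand side of (4.4.21) with <|z(0)|^2> = 1."""
    return cmath.exp(1j * omega * (t - s) - gamma * abs(t - s))
```

The estimate depends only on $t - s$, up to sampling error, and decays at the rate $\gamma$ set purely by the noise.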

A realistic oscillator cannot be described by this model of a complex oscillator,
as discussed by van Kampen [4.5]. However the qualitative behaviour is very similar,
and this model may be regarded as a prototype model of oscillators with
noisy frequency.

4.4.4 Ornstein-Uhlenbeck Process 


Taking the Fokker-Planck equation given for the Ornstein-Uhlenbeck process
(Sect.3.8.4), we can immediately write down the SDE using the result of Sect.4.3.4:

$$dx = -kx\,dt + \sqrt{D}\,dW(t),$$ (4.4.22)

and solve this directly. Putting

$$y = x\,e^{kt},$$ (4.4.23)

then

$$dy = (dx)\,d(e^{kt}) + (dx)\,e^{kt} + x\,d(e^{kt})$$
$$= [-kx\,dt + \sqrt{D}\,dW(t)]\,k\,e^{kt}dt + [-kx\,dt + \sqrt{D}\,dW(t)]\,e^{kt} + kx\,e^{kt}dt .$$ (4.4.24)

We note that the first product vanishes, involving only $dt^2$, and $dW(t)\,dt$ (in fact, it
can be seen that this will always happen if we simply multiply $x$ by a deterministic
function of time). We get

$$dy = \sqrt{D}\,e^{kt}\,dW(t)$$ (4.4.25)

so that integrating and resubstituting for $y$, we get

$$x(t) = x(0)\,e^{-kt} + \sqrt{D}\int_0^t e^{-k(t - t')}\,dW(t') .$$ (4.4.26)


If the initial condition is deterministic or Gaussian distributed, then $x(t)$ is clearly
Gaussian with mean and variance

$$\langle x(t)\rangle = \langle x(0)\rangle\,e^{-kt}$$ (4.4.27)

$$\mathrm{var}\{x(t)\} = \Big\langle\Big\{[x(0) - \langle x(0)\rangle]\,e^{-kt} + \sqrt{D}\int_0^t e^{-k(t - t')}\,dW(t')\Big\}^2\Big\rangle .$$ (4.4.28)

Taking the initial condition to be nonanticipating, that is, independent of $dW(t)$
for $t > 0$, we can write using the result of Sect.4.2.6f

$$\mathrm{var}\{x(t)\} = \mathrm{var}\{x(0)\}\,e^{-2kt} + D\int_0^t e^{-2k(t - t')}\,dt' = [\mathrm{var}\{x(0)\} - D/2k]\,e^{-2kt} + D/2k .$$ (4.4.29)

These equations are the same as those obtained directly by solving the Fokker-Planck
equation in Sect.3.8.4, with the added generalisation of a nonanticipating
random initial condition. Added to the fact that the solution is a Gaussian variable,
we also have the correct conditional probability.

The time correlation function can also be calculated directly and is,

$$\langle x(t), x(s)\rangle = \mathrm{var}\{x(0)\}\,e^{-k(t + s)} + D\Big\langle \int_0^t e^{-k(t - t')}\,dW(t')\int_0^s e^{-k(s - s')}\,dW(s')\Big\rangle$$
$$= \mathrm{var}\{x(0)\}\,e^{-k(t + s)} + D\int_0^{\min(t, s)} e^{-k(t + s - 2t')}\,dt'$$
$$= \Big[\mathrm{var}\{x(0)\} - \frac{D}{2k}\Big]e^{-k(t + s)} + \frac{D}{2k}\,e^{-k|t - s|} .$$ (4.4.30)

Notice that if $k > 0$, as $t, s \to \infty$ with finite $|t - s|$, the correlation function
becomes stationary and of the form deduced in Sect.3.8.4.

In fact, if we set the initial time at $-\infty$ rather than 0, the solution (4.4.26)
becomes

$$x(t) = \sqrt{D}\int_{-\infty}^{t} e^{-k(t - t')}\,dW(t') ,$$ (4.4.31)


in which the correlation function and the mean obviously assume their stationary 
values. Since the process is Gaussian, this makes it stationary. 
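
The formulae of this section translate directly into a sampling scheme. The sketch below is an illustration under stated assumptions, not part of the original text: it takes $k > 0$, and it uses the fact that, by (4.4.26) applied over one step of length `dt`, the new value is the old value damped by $e^{-k\,dt}$ plus a Gaussian variable of variance $D(1 - e^{-2k\,dt})/2k$. All function names are hypothetical.

```python
import math
import random


def ou_mean(mean0, k, t):
    """Mean (4.4.27)."""
    return mean0 * math.exp(-k * t)


def ou_variance(var0, k, D, t):
    """Variance (4.4.29)."""
    return (var0 - D / (2.0 * k)) * math.exp(-2.0 * k * t) + D / (2.0 * k)


def ou_covariance(var0, k, D, t, s):
    """Time correlation function (4.4.30)."""
    stationary = D / (2.0 * k)
    return (var0 - stationary) * math.exp(-k * (t + s)) + stationary * math.exp(-k * abs(t - s))


def ou_exact_step(x, k, D, dt, rng):
    """Advance x by dt using the exact Gaussian transition implied by (4.4.26)."""
    decay = math.exp(-k * dt)
    noise_std = math.sqrt(D * (1.0 - decay * decay) / (2.0 * k))
    return x * decay + noise_std * rng.gauss(0.0, 1.0)
```

Because the step is exact, `dt` need not be small; a long run of `ou_exact_step` should show a time-averaged $x^2$ close to the stationary variance $D/2k$.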
4.4.5 Conversion from Cartesian to Polar Coordinates 


A model often used to describe an optical field is given by a pair of Ornstein-Uhlenbeck
processes describing the real and imaginary components of the electric
field, i.e.,

$$dE_1(t) = -\gamma E_1(t)\,dt + \varepsilon\,dW_1(t)$$
$$dE_2(t) = -\gamma E_2(t)\,dt + \varepsilon\,dW_2(t) .$$ (4.4.32)

It is of interest to convert to polar coordinates. We set

$$E_1(t) = a(t)\cos\phi(t)$$
$$E_2(t) = a(t)\sin\phi(t)$$ (4.4.33)

and for simplicity, also define

$$\mu(t) = \log a(t)$$ (4.4.34)

so that

$$\mu(t) + i\phi(t) = \log[E_1(t) + iE_2(t)] .$$ (4.4.35)


We then use the Ito calculus to derive

$$d(\mu + i\phi) = \frac{d(E_1 + iE_2)}{E_1 + iE_2} - \frac{1}{2}\frac{[d(E_1 + iE_2)]^2}{(E_1 + iE_2)^2}$$
$$= -\frac{\gamma(E_1 + iE_2)}{E_1 + iE_2}\,dt + \frac{\varepsilon[dW_1(t) + i\,dW_2(t)]}{E_1 + iE_2} - \frac{1}{2}\frac{\varepsilon^2[dW_1(t) + i\,dW_2(t)]^2}{(E_1 + iE_2)^2}$$ (4.4.36)

and noting $dW_1(t)\,dW_2(t) = 0$, $dW_1(t)^2 = dW_2(t)^2 = dt$, it can be seen that the last
term vanishes, so we find

$$d[\mu(t) + i\phi(t)] = -\gamma\,dt + \varepsilon\exp[-\mu(t) - i\phi(t)]\,\{dW_1(t) + i\,dW_2(t)\} .$$ (4.4.37)

We now take the real part, set $a(t) = \exp[\mu(t)]$ and using the Ito calculus find

$$da(t) = \{-\gamma a(t) + \tfrac{1}{2}\varepsilon^2/a(t)\}\,dt + \varepsilon\{dW_1(t)\cos\phi(t) + dW_2(t)\sin\phi(t)\} .$$ (4.4.38)

The imaginary part yields

$$d\phi(t) = [\varepsilon/a(t)]\,[-dW_1(t)\sin\phi(t) + dW_2(t)\cos\phi(t)] .$$ (4.4.39)
We now define

$$dW_a(t) = dW_1(t)\cos\phi(t) + dW_2(t)\sin\phi(t)$$
$$dW_\phi(t) = -dW_1(t)\sin\phi(t) + dW_2(t)\cos\phi(t) .$$ (4.4.40)

We note that this is an orthogonal transformation of the kind mentioned in Sect.
4.3.5, so that we may take $dW_a(t)$ and $dW_\phi(t)$ as increments of independent Wiener
processes $W_a(t)$ and $W_\phi(t)$.

Hence, the stochastic differential equations for phase and amplitude are

$$d\phi(t) = [\varepsilon/a(t)]\,dW_\phi(t)$$ (4.4.41a)
$$da(t) = [-\gamma a(t) + \tfrac{1}{2}\varepsilon^2/a(t)]\,dt + \varepsilon\,dW_a(t) .$$ (4.4.41b)


Comment. Using the rules given in Sect. 4.3.6 (ii), it is possible to convert both the
Cartesian equation (4.4.32) and the polar equations (4.4.41) to the Stratonovich
form, and to find that both are exactly the same as the Ito form. Nevertheless, a
direct conversion using ordinary calculus is not possible. Doing so we would get
the same result until (4.4.38) where the term $[\tfrac{1}{2}\varepsilon^2/a(t)]\,dt$ would not be found.
This must be compensated by an extra term which arises from the fact that the
Stratonovich increments $dW_i(t)$ are correlated with $\phi(t)$ and thus, $dW_a(t)$ and
$dW_\phi(t)$ cannot simply be defined by (4.4.40). We see the advantage of the Ito
method which retains the statistical independence of $dW(t)$ and variables evaluated
at time $t$.

Unfortunately, the equations in polar form are not soluble, as the corresponding
Cartesian equations are. There is an advantage, however, in dealing with polar
equations in the laser, whose equations are similar, but have an added term
proportional to $a(t)^2\,dt$ in (4.4.41b).


4.4.6 Multivariate Ornstein-Uhlenbeck Process

We define the process by the SDE

$$dx(t) = -A\,x(t)\,dt + B\,dW(t),$$ (4.4.42)

($A$ and $B$ are constant matrices) for which the solution is easily obtained (as in Sect.
4.4.4):

$$x(t) = \exp(-At)\,x(0) + \int_0^t \exp[-A(t - t')]\,B\,dW(t') .$$ (4.4.43)

The mean is

$$\langle x(t)\rangle = \exp(-At)\,\langle x(0)\rangle .$$ (4.4.44)

The correlation function follows similarly

$$\langle x(t), x^{\mathrm T}(s)\rangle \equiv \langle [x(t) - \langle x(t)\rangle][x(s) - \langle x(s)\rangle]^{\mathrm T}\rangle$$
$$= \exp(-At)\,\langle x(0), x^{\mathrm T}(0)\rangle\exp(-A^{\mathrm T}s) + \int_0^{\min(t, s)}\exp[-A(t - t')]\,BB^{\mathrm T}\exp[-A^{\mathrm T}(s - t')]\,dt' .$$ (4.4.45)

The integral can be explicitly evaluated in certain special cases, and for particular
low-dimensional problems, it is possible to simply multiply everything out term
by term. In the remainder we set $\langle x(0), x^{\mathrm T}(0)\rangle = 0$, corresponding to a
deterministic initial condition, and evaluate a few special cases.


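a) Suppose $AA^{\mathrm T} = A^{\mathrm T}A$
Then we can find a unitary matrix $S$ such that

$$SS^{\dagger} = 1$$
$$SAS^{\dagger} = SA^{\mathrm T}S^{\dagger} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) .$$ (4.4.46)

For simplicity, assume $t \ge s$. Then

$$\langle x(t), x^{\mathrm T}(s)\rangle = S^{\dagger}G(t, s)\,S ,$$

where

$$[G(t, s)]_{ij} = \frac{(SBB^{\mathrm T}S^{\dagger})_{ij}}{\lambda_i + \lambda_j}\,[\exp(-\lambda_i|t - s|) - \exp(-\lambda_i t - \lambda_j s)] .$$ (4.4.47)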


b) Variance in Stationary Solution
If $A$ has only eigenvalues with positive real part, a stationary solution exists of
the form

$$x_s(t) = \int_{-\infty}^{t}\exp[-A(t - t')]\,B\,dW(t') .$$ (4.4.48)

We have of course

$$\langle x_s(t)\rangle = 0$$
$$\langle x_s(t), x_s^{\mathrm T}(s)\rangle = \int_{-\infty}^{\min(t, s)}\exp[-A(t - t')]\,BB^{\mathrm T}\exp[-A^{\mathrm T}(s - t')]\,dt' .$$ (4.4.49)

Let us define the stationary covariance matrix $\sigma$ by

$$\sigma = \langle x_s(t), x_s^{\mathrm T}(t)\rangle .$$ (4.4.50)

Then the evaluation of this quantity can be achieved algebraically for

$$A\sigma + \sigma A^{\mathrm T} = \int_{-\infty}^{t} A\exp[-A(t - t')]\,BB^{\mathrm T}\exp[-A^{\mathrm T}(t - t')]\,dt' + \int_{-\infty}^{t}\exp[-A(t - t')]\,BB^{\mathrm T}\exp[-A^{\mathrm T}(t - t')]\,A^{\mathrm T}\,dt'$$
$$= \int_{-\infty}^{t}\frac{d}{dt'}\{\exp[-A(t - t')]\,BB^{\mathrm T}\exp[-A^{\mathrm T}(t - t')]\}\,dt' .$$

Carrying out the integral, we find that the lower limit vanishes by the assumed
positivity of the eigenvalues of $A$ and hence only the upper limit remains, giving

$$A\sigma + \sigma A^{\mathrm T} = BB^{\mathrm T}$$ (4.4.51)

as an algebraic equation for the stationary covariance matrix.
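c) Stationary Variance for Two Dimensions
We note that if $A$ is a $2 \times 2$ matrix, it satisfies the characteristic equation

$$A^2 - (\mathrm{Tr}\,A)A + (\mathrm{Det}\,A) = 0$$ (4.4.52)

and from (4.4.49) and the fact that (4.4.52) implies $\exp(-At)$ is a polynomial of
degree 1 in $A$, we must be able to write

$$\sigma = \alpha BB^{\mathrm T} + \beta(ABB^{\mathrm T} + BB^{\mathrm T}A^{\mathrm T}) + \gamma ABB^{\mathrm T}A^{\mathrm T} .$$

Using (4.4.52), we find (4.4.51) is satisfied if

$$\alpha + (\mathrm{Tr}\,A)\beta - (\mathrm{Det}\,A)\gamma = 0$$
$$2\beta(\mathrm{Det}\,A) + 1 = 0$$
$$\beta + (\mathrm{Tr}\,A)\gamma = 0 .$$

From which,

$$\sigma = \frac{(\mathrm{Det}\,A)\,BB^{\mathrm T} + [A - (\mathrm{Tr}\,A)1]\,BB^{\mathrm T}\,[A - (\mathrm{Tr}\,A)1]^{\mathrm T}}{2(\mathrm{Tr}\,A)(\mathrm{Det}\,A)} .$$ (4.4.53)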
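
As a check on the algebra, the closed form (4.4.53) can be evaluated and substituted back into (4.4.51). The sketch below is illustrative and not part of the original text; it represents $2 \times 2$ matrices as nested lists, and it uses the elementary fact that both eigenvalues of a real $2 \times 2$ matrix have positive real part exactly when $\mathrm{Tr}\,A > 0$ and $\mathrm{Det}\,A > 0$. All function names are hypothetical.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]


def stationary_covariance_2d(A, B):
    """Stationary covariance sigma from (4.4.53) for dx = -A x dt + B dW."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if tr <= 0.0 or det <= 0.0:
        raise ValueError("A must have eigenvalues with positive real part")
    bbt = matmul(B, transpose(B))
    # C = A - (Tr A) 1
    C = [[A[i][j] - (tr if i == j else 0.0) for j in range(2)] for i in range(2)]
    cbc = matmul(matmul(C, bbt), transpose(C))
    return [[(det * bbt[i][j] + cbc[i][j]) / (2.0 * tr * det) for j in range(2)]
            for i in range(2)]


def lyapunov_residual(A, B, sigma):
    """Return A sigma + sigma A^T - B B^T, which (4.4.51) requires to vanish."""
    lhs1 = matmul(A, sigma)
    lhs2 = matmul(sigma, transpose(A))
    bbt = matmul(B, transpose(B))
    return [[lhs1[i][j] + lhs2[i][j] - bbt[i][j] for j in range(2)] for i in range(2)]
```

For example, with `A = [[1, 2], [0, 3]]` and `B = [[1, 0], [1, 1]]` the residual returned by `lyapunov_residual` vanishes to rounding error.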


d) Time Correlation Matrix in the Stationary State
From the solution of (4.4.49), we see that if $t > s$,

$$\langle x_s(t), x_s^{\mathrm T}(s)\rangle = \exp[-A(t - s)]\int_{-\infty}^{s}\exp[-A(s - t')]\,BB^{\mathrm T}\exp[-A^{\mathrm T}(s - t')]\,dt'$$
$$= \exp[-A(t - s)]\,\sigma \qquad t > s$$ (4.4.54a)

and similarly,

$$= \sigma\exp[-A^{\mathrm T}(s - t)] \qquad t < s$$ (4.4.54b)

which gives the dependence on $s - t$ as expected of a stationary solution. Defining
then

$$G_s(t - s) = \langle x_s(t), x_s^{\mathrm T}(s)\rangle ,$$ (4.4.55)

we see (remembering $\sigma = \sigma^{\mathrm T}$) that

$$G_s(t - s) = [G_s(s - t)]^{\mathrm T} .$$ (4.4.56)
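e) Spectrum Matrix in Stationary State

The spectrum matrix turns out to be rather simple. We define similarly to Sect.
1.4.2:

$$S(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-i\omega\tau}\,G_s(\tau)\,d\tau$$ (4.4.57)
$$= \frac{1}{2\pi}\Big\{\int_0^{\infty}\exp[-(i\omega + A)\tau]\,\sigma\,d\tau + \int_{-\infty}^{0}\sigma\exp[(-i\omega + A^{\mathrm T})\tau]\,d\tau\Big\}$$
$$= \frac{1}{2\pi}\,[(A + i\omega)^{-1}\sigma + \sigma(A^{\mathrm T} - i\omega)^{-1}] .$$

Hence, $(A + i\omega)\,S(\omega)\,(A^{\mathrm T} - i\omega) = \frac{1}{2\pi}(\sigma A^{\mathrm T} + A\sigma)$ and using (4.4.51), we get

$$S(\omega) = \frac{1}{2\pi}\,(A + i\omega)^{-1}\,BB^{\mathrm T}\,(A^{\mathrm T} - i\omega)^{-1} .$$ (4.4.58)

f) Regression Theorem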

The result (4.4.54a) is also known as a regression theorem in that it states that the
time development $G_s(\tau)$ is for $\tau > 0$ governed by the same law of time development
of the mean [e.g., (4.4.44)]. It is a consequence of the Markovian linear nature of
the problem. For

$$\frac{d}{d\tau}[G_s(\tau)]\,d\tau = \frac{d}{d\tau}\langle x_s(\tau), x_s^{\mathrm T}(0)\rangle\,d\tau = \langle [-A\,x_s(\tau)\,d\tau + B\,dW(\tau)],\, x_s^{\mathrm T}(0)\rangle$$ (4.4.59)

and since $\tau > 0$, $dW(\tau)$ is uncorrelated with $x_s^{\mathrm T}(0)$, so

$$\frac{d}{d\tau}[G_s(\tau)] = -A\,G_s(\tau) .$$ (4.4.59)

Thus, computation of $G_s(\tau)$ requires the knowledge of $G_s(0) = \sigma$ and the time
development equation of the mean. This result is similar to those of Sect.3.7.4.


4.4.7 The General Single Variable Linear Equation 


a) Homogeneous Case 
We consider firstly the homogeneous case 


$$dx = [b(t)\,dt + g(t)\,dW(t)]\,x ,$$ (4.4.60)

and using the usual Ito rules, write

$$y = \log x$$ (4.4.61)

so that

$$dy = dx/x - \tfrac{1}{2}(dx)^2/x^2 = [b(t)\,dt + g(t)\,dW(t)] - \tfrac{1}{2}g(t)^2\,dt ,$$ (4.4.62)

and integrating and inverting (4.4.61), we get

$$x(t) = x(0)\exp\Big\{\int_0^t [b(t') - \tfrac{1}{2}g(t')^2]\,dt' + \int_0^t g(t')\,dW(t')\Big\}$$ (4.4.63)
$$\equiv x(0)\,\phi(t)$$ (4.4.64)

which serves to define $\phi(t)$.
We note that [using (4.4.8)]

$$\langle [x(t)]^n\rangle = \langle [x(0)]^n\rangle\Big\langle \exp\Big\{n\int_0^t [b(t') - \tfrac{1}{2}g(t')^2]\,dt' + n\int_0^t g(t')\,dW(t')\Big\}\Big\rangle$$
$$= \langle [x(0)]^n\rangle\exp\Big\{n\int_0^t b(t')\,dt' + \tfrac{1}{2}n(n - 1)\int_0^t g(t')^2\,dt'\Big\} .$$ (4.4.65)

b) Inhomogeneous Case
Now consider

$$dx = [a(t) + b(t)x]\,dt + [f(t) + g(t)x]\,dW(t)$$ (4.4.66)

and write

$$z(t) = x(t)\,[\phi(t)]^{-1}$$ (4.4.67)

with $\phi(t)$ as defined in (4.4.64) and a solution of the homogeneous equation
(4.4.60). Then we write

$$dz = dx\,[\phi(t)]^{-1} + x\,d[\phi(t)^{-1}] + dx\,d[\phi(t)^{-1}] .$$

Noting that $d[\phi(t)]^{-1} = -d\phi(t)\,[\phi(t)]^{-2} + [d\phi(t)]^2[\phi(t)]^{-3}$ and using Ito rules, we
find

$$dz = \{[a(t) - f(t)g(t)]\,dt + f(t)\,dW(t)\}\,\phi(t)^{-1}$$ (4.4.68)

which is directly integrable. Hence, the solution is

$$x(t) = \phi(t)\Big\{x(0) + \int_0^t \phi(t')^{-1}\{[a(t') - f(t')g(t')]\,dt' + f(t')\,dW(t')\}\Big\} .$$ (4.4.69)


c) Moments and Autocorrelation 

It is better to derive equations for the moments from (4.4.66) rather than calculate
moments and autocorrelation directly from the solution (4.4.69).

For we have

$$d[x(t)^n] = n\,x(t)^{n-1}\,dx(t) + \frac{n(n - 1)}{2}\,x(t)^{n-2}\,[dx(t)]^2$$
$$= n\,x(t)^{n-1}\,dx(t) + \frac{n(n - 1)}{2}\,x(t)^{n-2}\,[f(t) + g(t)x(t)]^2\,dt .$$ (4.4.70)

Hence,

$$\frac{d}{dt}\langle x(t)^n\rangle = \langle x(t)^n\rangle\Big[n\,b(t) + \frac{n(n - 1)}{2}\,g(t)^2\Big] + \langle x(t)^{n-1}\rangle\,[n\,a(t) + n(n - 1)f(t)g(t)] + \langle x(t)^{n-2}\rangle\Big[\frac{n(n - 1)}{2}\,f(t)^2\Big] .$$ (4.4.71)

These equations form a hierarchy in which the $n$th equation involves the solutions
of the previous two, and can be integrated successively.
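
Because the hierarchy (4.4.71) is closed from below, it can be integrated numerically as an ordinary system. The sketch below is an illustration and not part of the original text: it assumes constant coefficients $a$, $b$, $f$, $g$, a deterministic initial value $x(0) = x_0$, and a standard fourth-order Runge-Kutta step; the function names are hypothetical.

```python
import math


def moment_rhs(m, a, b, f, g):
    """Right-hand side of (4.4.71) for the moments m[n] = <x^n>, with m[0] = 1."""
    d = [0.0] * len(m)
    for n in range(1, len(m)):
        d[n] = m[n] * (n * b + 0.5 * n * (n - 1) * g * g)
        d[n] += m[n - 1] * (n * a + n * (n - 1) * f * g)
        if n >= 2:
            d[n] += m[n - 2] * 0.5 * n * (n - 1) * f * f
    return d


def integrate_moments(x0, n_max, a, b, f, g, t, steps=1000):
    """Integrate the moment hierarchy from 0 to t with fixed-step RK4."""
    m = [x0 ** n for n in range(n_max + 1)]
    h = t / steps
    for _ in range(steps):
        k1 = moment_rhs(m, a, b, f, g)
        k2 = moment_rhs([mi + 0.5 * h * ki for mi, ki in zip(m, k1)], a, b, f, g)
        k3 = moment_rhs([mi + 0.5 * h * ki for mi, ki in zip(m, k2)], a, b, f, g)
        k4 = moment_rhs([mi + h * ki for mi, ki in zip(m, k3)], a, b, f, g)
        m = [mi + h * (p + 2.0 * q + 2.0 * r + s) / 6.0
             for mi, p, q, r, s in zip(m, k1, k2, k3, k4)]
    return m
```

In the homogeneous case $a = f = 0$ the result should reproduce (4.4.65), and for $g = 0$, $a = 0$, $b = -k$, $f = \sqrt{D}$ the second moment should relax to the Ornstein-Uhlenbeck value $D/2k$.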
4.4.8 Multivariable Linear Equations


a) Homogeneous Case 
The equation is 


$$dx(t) = \Big[B(t)\,dt + \sum_i G_i(t)\,dW_i(t)\Big]x(t) ,$$ (4.4.72)

where $B(t)$, $G_i(t)$ are matrices. The equation is not, in general, soluble in closed form
unless all the matrices $B(t)$, $G_i(t')$ commute at all times with each other, i.e.

$$G_i(t)\,G_j(t') = G_j(t')\,G_i(t)$$
$$B(t)\,G_i(t') = G_i(t')\,B(t)$$ (4.4.73)
$$B(t)\,B(t') = B(t')\,B(t) .$$

In this case, the solution is completely analogous to the one variable case and we
have

$$x(t) = \phi(t)\,x(0)$$

with

$$\phi(t) = \exp\Big\{\int_0^t \Big[B(t') - \tfrac{1}{2}\sum_i G_i(t')^2\Big]dt' + \int_0^t \sum_i G_i(t')\,dW_i(t')\Big\} .$$ (4.4.74)
b) Inhomogeneous Case

We can reduce the inhomogeneous case to the homogeneous case in exactly the
same way as in one dimension. Thus, we consider

$$dx(t) = [A(t) + B(t)x]\,dt + \sum_i [F_i(t) + G_i(t)x]\,dW_i(t)$$ (4.4.75)

and write

$$y(t) = \psi(t)^{-1}x(t) ,$$ (4.4.76)

where $\psi(t)$ is a matrix solution of the homogeneous equation (4.4.72). We first have
to evaluate $d[\psi^{-1}]$. For any matrix $M$ we have $MM^{-1} = 1$, so, expanding to second
order, $M\,d[M^{-1}] + dM\,M^{-1} + dM\,d[M^{-1}] = 0$.
Hence, $d[M^{-1}] = -[M + dM]^{-1}\,dM\,M^{-1}$ and again to second order

$$d[M^{-1}] = -M^{-1}\,dM\,M^{-1} + M^{-1}\,dM\,M^{-1}\,dM\,M^{-1}$$ (4.4.77)

and thus, since $\psi(t)$ satisfies the homogeneous equation,

$$d[\psi(t)^{-1}] = \psi(t)^{-1}\Big\{\Big[-B(t) + \sum_i G_i(t)^2\Big]dt - \sum_i G_i(t)\,dW_i(t)\Big\} ,$$

and, again taking differentials

$$dy(t) = \psi(t)^{-1}\Big\{\Big[A(t) - \sum_i G_i(t)F_i(t)\Big]dt + \sum_i F_i(t)\,dW_i(t)\Big\} .$$

Hence,

$$x(t) = \psi(t)\Big\{x(0) + \int_0^t \psi(t')^{-1}\Big\{\Big[A(t') - \sum_i G_i(t')F_i(t')\Big]dt' + \sum_i F_i(t')\,dW_i(t')\Big\}\Big\} .$$ (4.4.78)


This solution is not very useful for practical purposes, even when the solution for 
the homogeneous equation is known, because of the difficulty in evaluating means 
and correlation functions. 


4.4.9 Time-Dependent Ornstein-Uhlenbeck Process 
This is a particular case of the previous general linear equation which is soluble. It 
is a generalisation of the multivariate Ornstein-Uhlenbeck process (Sect.4.4.6) to 
include time-dependent parameters, namely, 

dx(t) = —A(t)x(t)dt + B(t)dW(t). (4.4.79) 


This is clearly of the same form as (4.4.75) with the replacements

$$A(t) \to 0$$
$$B(t) \to -A(t)$$
$$\sum_i F_i(t)\,dW_i(t) \to B(t)\,dW(t)$$ (4.4.80)
$$G_i(t) \to 0 .$$

The corresponding homogeneous equation is simply the deterministic equation

$$dx(t) = -A(t)\,x(t)\,dt$$ (4.4.81)

which is soluble provided $A(t)A(t') = A(t')A(t)$ and has the solution

$$x(t) = \psi(t)\,x(0)$$

with

$$\psi(t) = \exp\Big[-\int_0^t A(t')\,dt'\Big] .$$ (4.4.82)

Thus, applying (4.4.78),

$$x(t) = \exp\Big[-\int_0^t A(t')\,dt'\Big]x(0) + \int_0^t \Big\{\exp\Big[-\int_{t'}^{t} A(s)\,ds\Big]\Big\}B(t')\,dW(t') .$$ (4.4.83)

This is very similar to the solution of the time-independent Ornstein-Uhlenbeck
process, as derived in Sect. 4.4.6 (4.4.43).
From this we have

$$\langle x(t)\rangle = \exp\Big[-\int_0^t A(t')\,dt'\Big]\langle x(0)\rangle$$ (4.4.84)

$$\langle x(t), x^{\mathrm T}(t)\rangle = \exp\Big[-\int_0^t A(t')\,dt'\Big]\langle x(0), x(0)^{\mathrm T}\rangle\exp\Big[-\int_0^t A^{\mathrm T}(t')\,dt'\Big]$$
$$+ \int_0^t dt'\,\exp\Big[-\int_{t'}^{t} A(s)\,ds\Big]B(t')B^{\mathrm T}(t')\exp\Big[-\int_{t'}^{t} A^{\mathrm T}(s)\,ds\Big] .$$ (4.4.85)


The time-dependent Ornstein-Uhlenbeck process will arise very naturally in 
connection with the development of asymptotic methods in low-noise systems. 


5. The Fokker-Planck Equation 


In this rather long chapter, the theory of continuous Markov processes is developed 
from the point of view of the corresponding Fokker-Planck equation, which gives 
the time evolution of the probability density function for the system. The chapter 
divides into two main subjects—single variable and multivariable processes. There 
are a large number of exact results for single variable systems, which makes the
separate treatment of such systems appropriate. Thus Sect. 5.2 and its subsections 
treat all aspects of one variable systems, whereas Sect. 5.3 deals with multivariable 
systems. However, the construction of appropriate boundary conditions is of 
fundamental importance in both cases, and is carried out in general in Sect. 5.2.1. 
A corresponding treatment for the boundary conditions on the backward Fokker- 
Planck equation is given in Sect. 5.2.4. The remaining subsections of Sect. 5.2 are 
devoted to a range of exact results, on stationary distribution functions, properties 
of eigenfunctions, and exit problems, most of which can be explicitly solved in the 
one variable case. 

Section 5.3 and its subsections explore exact results for many variable systems. 
These results are not as explicit as for the one variable case. An extra feature which 
is included is the concept of detailed balance in multivariable systems, which is 
almost trivial in one variable systems, but leads to very interesting conclusions in 
multivariable systems. 

The chapter concludes with a treatment of exact results in exit problems for 
multivariable Fokker-Planck equations. 


5.1 Background 


We have already met the Fokker-Planck equation in several contexts, starting from 
Einstein’s original derivation and use of the diffusion equation (Sect.1.2), again as 
a particular case of the differential Chapman-Kolmogorov equation (Sect.3.5.2), 
and finally, in connection with stochastic differential equations (Sect.4.3.4). There 
are many techniques associated with the use of Fokker-Planck equations which lead 
to results more directly than by direct use of the corresponding stochastic differential
equation; the reverse is also true. To obtain a full picture of the nature of diffusion
processes, one must study both points of view.

The origin of the name “Fokker-Planck Equation” is from the work of Fokker
(1914) [5.1] and Planck (1917) [5.2] where the former investigated Brownian motion
in a radiation field and the latter attempted to build a complete theory of fluctuations
based on it. Mathematically oriented works tend to use the term “Kolmogorov’s
Equation” because of Kolmogorov’s work in developing its rigorous basis
[5.3]. Yet others use the term “Smoluchowski Equation” because of Smoluchowski’s
original use of this equation. Without in any way assessing the merits of this
terminology, I shall use the term “Fokker-Planck equation” as that most commonly used
by the audience to which this book is addressed.


5.2 Fokker-Planck Equation in One Dimension 


In one dimension, the Fokker-Planck equation (FPE) takes the simple form 


$$\frac{\partial f(x, t)}{\partial t} = -\frac{\partial}{\partial x}[A(x, t)f(x, t)] + \frac{1}{2}\frac{\partial^2}{\partial x^2}[B(x, t)f(x, t)] .$$ (5.2.1)

In Sects.3.4,5, the FPE was shown to be valid for the conditional probability,
that is, the choice

$$f(x, t) = p(x, t\,|\,x_0, t_0)$$ (5.2.2)

for any initial $x_0$, $t_0$, and with the initial condition

$$p(x, t_0\,|\,x_0, t_0) = \delta(x - x_0) .$$ (5.2.3)

However, using the definition for the one time probability

$$p(x, t) = \int dx_0\,p(x, t;\, x_0, t_0) \equiv \int dx_0\,p(x, t\,|\,x_0, t_0)\,p(x_0, t_0) ,$$ (5.2.4)

we see that it is also valid for $p(x, t)$ with the initial condition

$$p(x, t)\big|_{t = t_0} = p(x, t_0) ,$$ (5.2.5)

which is generally less singular than (5.2.3).

From the result of Sect.4.3.4, we know that the stochastic process described 
by a conditional probability satisfying the FPE is equivalent to the Ito stochastic 
differential equation (SDE) 


$$dx(t) = A[x(t), t]\,dt + \sqrt{B[x(t), t]}\,dW(t)$$


and that the two descriptions are to be regarded as complementary to each other. 
We will see that perturbation theories based on the FPE are very different from 
those based on the SDE and both have their uses. 
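
The equivalence of the two descriptions can be illustrated numerically. The following is a
minimal sketch, not taken from the text: it integrates the Ito SDE above by the Euler-Maruyama
scheme and compares the sample variance with that of the stationary solution of the
corresponding FPE. The function name `euler_maruyama`, the step size, the number of samples
and the Ornstein-Uhlenbeck choice A = -kx, B = D are all illustrative assumptions. For that
choice the stationary density is proportional to exp(-k x^2 / D), so its variance is D/(2k).

```python
import math
import random

def euler_maruyama(A, B, x0, t_end, dt, rng):
    """Integrate dx = A(x,t) dt + sqrt(B(x,t)) dW from t = 0 to t_end; return x(t_end)."""
    x, t = x0, 0.0
    n_steps = int(round(t_end / dt))
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        dW = rng.gauss(0.0, 1.0) * sqrt_dt      # Wiener increment with variance dt
        x += A(x, t) * dt + math.sqrt(B(x, t)) * dW
        t += dt
    return x

# Illustrative parameters (assumptions, not from the text)
k, D = 1.0, 1.0
rng = random.Random(0)
samples = [euler_maruyama(lambda x, t: -k * x, lambda x, t: D, 0.0, 5.0, 0.01, rng)
           for _ in range(2000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)   # the variance should lie close to D / (2 k)
```

The agreement is only statistical: the sample variance fluctuates with the number of
trajectories, and the finite time step introduces a small bias.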


5.2.1 Boundary Conditions 


The FPE is a second-order parabolic partial differential equation, and for solutions 
we need an initial condition such as (5.2.5) and boundary conditions at the end of 
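the interval inside which x is constrained. These take on a variety of forms.
It is simpler to derive the boundary conditions in general than to restrict
consideration to the one variable situation. We consider the forward equation

$$\partial_t\, p(\mathbf{z},t) = -\sum_i \frac{\partial}{\partial z_i}\,A_i(\mathbf{z},t)\,p(\mathbf{z},t) + \frac{1}{2}\sum_{i,j}\frac{\partial^2}{\partial z_i\,\partial z_j}\,B_{ij}(\mathbf{z},t)\,p(\mathbf{z},t). \tag{5.2.6}$$

We note that this can also be written

$$\frac{\partial p(\mathbf{z},t)}{\partial t} + \sum_i \frac{\partial}{\partial z_i}\,J_i(\mathbf{z},t) = 0, \tag{5.2.7}$$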


where we define the probability current 
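$$J_i(\mathbf{z},t) = A_i(\mathbf{z},t)\,p(\mathbf{z},t) - \frac{1}{2}\sum_j \frac{\partial}{\partial z_j}\,B_{ij}(\mathbf{z},t)\,p(\mathbf{z},t). \tag{5.2.8}$$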


Equation (5.2.7) has the form of a local conservation equation, and can be written 
in an integral form as follows. Consider some region R with a boundary S and 
define 


$$P(R,t) = \int_R d\mathbf{z}\; p(\mathbf{z},t);$$

then (5.2.7) is equivalent to

$$\frac{\partial P(R,t)}{\partial t} = -\int_S dS\; \mathbf{n}\cdot\mathbf{J}(\mathbf{z},t), \tag{5.2.9}$$

where $\mathbf{n}$ is the outward pointing normal to S. Thus (5.2.9) indicates that the total
loss of probability is given by the surface integral of J over the boundary of R. We
can show as well that the current J does have the somewhat stronger property, that
a surface integral over any surface S gives the net flow of probability across that
surface.


Fig. 5.1. Regions used to demonstrate that the probability current is the flow of
probability


For consider two adjacent regions $R_1$ and $R_2$, separated by a surface $S_{12}$. Let
$S_1$ and $S_2$ be the surfaces which, together with $S_{12}$, enclose respectively $R_1$ and $R_2$
(see Fig. 5.1).

Then the net flow of probability can be computed by noting that we are dealing
here with a process with continuous sample paths, so that, in a sufficiently short
time $\Delta t$, the probability of crossing $S_{12}$ from $R_2$ to $R_1$ is the joint probability of
being in $R_2$ at time $t$ and $R_1$ at time $t + \Delta t$,

$$= \int_{R_1} d\mathbf{x} \int_{R_2} d\mathbf{y}\; p(\mathbf{x},t+\Delta t;\,\mathbf{y},t).$$

The net flow of probability from $R_2$ to $R_1$ is obtained by subtracting from this the
probability of crossing in the reverse direction, and dividing by $\Delta t$; i.e.

$$\lim_{\Delta t\to 0}\frac{1}{\Delta t}\int_{R_1} d\mathbf{x}\int_{R_2} d\mathbf{y}\;[\,p(\mathbf{x},t+\Delta t;\,\mathbf{y},t) - p(\mathbf{y},t+\Delta t;\,\mathbf{x},t)\,]. \tag{5.2.10}$$


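Note that

$$\int_{R_1} d\mathbf{x}\int_{R_2} d\mathbf{y}\; p(\mathbf{x},t;\,\mathbf{y},t) = 0$$

since this is the probability of being in $R_1$ and $R_2$ simultaneously. Thus, we can
write

$$\text{(5.2.10)} = \int_{R_1} d\mathbf{x}\int_{R_2} d\mathbf{y}\;[\,\partial_{t'}\,p(\mathbf{x},t';\,\mathbf{y},t) - \partial_{t'}\,p(\mathbf{y},t';\,\mathbf{x},t)\,]_{t'=t}$$

and using the Fokker-Planck equation in the form (5.2.7)

$$= -\int_{R_1} d\mathbf{x}\sum_i \frac{\partial}{\partial x_i}\,J_i(\mathbf{x},t;\,R_2,t) + \int_{R_2} d\mathbf{y}\sum_i \frac{\partial}{\partial y_i}\,J_i(\mathbf{y},t;\,R_1,t), \tag{5.2.11}$$

where $J_i(\mathbf{x},t;\,R_2,t)$ is formed from

$$p(\mathbf{x},t;\,R_2,t) = \int_{R_2} d\mathbf{y}\; p(\mathbf{x},t;\,\mathbf{y},t)$$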


in the same way as $\mathbf{J}(\mathbf{z},t)$ is formed from $p(\mathbf{z},t)$ in (5.2.8), and $J_i(\mathbf{y},t;\,R_1,t)$ is defined
similarly. We now convert the integrals to surface integrals. The integral over $S_2$
vanishes, since it will involve $p(\mathbf{x},t;\,R_1,t)$, with $\mathbf{x}$ not in $R_1$ or on its boundary
(except for a set of measure zero). Similarly the integral over $S_1$ vanishes, but those
over $S_{12}$ do not, since here the integration is simply over part of the boundaries of
$R_1$ and $R_2$.

Thus we find that the net flow from $R_2$ to $R_1$ is

$$\int_{S_{12}} dS\; \mathbf{n}\cdot\{\mathbf{J}(\mathbf{x},t;\,R_1,t) + \mathbf{J}(\mathbf{x},t;\,R_2,t)\}$$

and we finally conclude, since $\mathbf{x}$ belongs to the union of $R_1$ and $R_2$, that the net flow
of probability per unit time from $R_2$ to $R_1$

$$= \lim_{\Delta t\to 0}\frac{1}{\Delta t}\int_{R_1} d\mathbf{x}\int_{R_2} d\mathbf{y}\;[\,p(\mathbf{x},t+\Delta t;\,\mathbf{y},t) - p(\mathbf{y},t+\Delta t;\,\mathbf{x},t)\,] = \int_{S_{12}} dS\; \mathbf{n}\cdot\mathbf{J}(\mathbf{x},t), \tag{5.2.12}$$

where $\mathbf{n}$ points from $R_2$ to $R_1$.

We can now consider the various boundary conditions separately.


a) Reflecting Barrier 
We can consider the situation where the particle cannot leave a region R, hence 
there is zero net flow of probability across S, the boundary of R. Thus we require 


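$$\mathbf{n}\cdot\mathbf{J}(\mathbf{z},t) = 0 \quad \text{for } \mathbf{z}\in S, \qquad \mathbf{n} = \text{normal to } S, \tag{5.2.13}$$

where $\mathbf{J}(\mathbf{z},t)$ is given by (5.2.8).
Since the particle cannot cross S, it must be reflected there, and hence the
name reflecting barrier for this condition.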


b) Absorbing Barrier 
Here, one assumes that the moment the particle reaches S, it is removed from the
system, thus the barrier absorbs. Consequently, the probability of being on the
boundary is zero, i.e.

$$p(\mathbf{z},t) = 0 \quad \text{for } \mathbf{z}\in S. \tag{5.2.14}$$


c) Boundary Conditions at a Discontinuity 

It is possible for both the $A_i$ and $B_{ij}$ coefficients to be discontinuous at a surface S,
but for there to be free motion across S. Consequently, the probability and the
normal component of the current must both be continuous across S,

$$\mathbf{n}\cdot\mathbf{J}(\mathbf{z})\big|_{S_+} = \mathbf{n}\cdot\mathbf{J}(\mathbf{z})\big|_{S_-} \tag{5.2.15}$$
$$p(\mathbf{z})\big|_{S_+} = p(\mathbf{z})\big|_{S_-} \tag{5.2.16}$$

where $S_+$, $S_-$, as subscripts, mean the limits of the quantities from the left and right
hand sides of the surface.

The definition (5.2.8) of the current indicates that the derivatives of $p(\mathbf{z})$ are
not necessarily continuous at S.


d) Periodic Boundary Condition 

We assume that the process takes place on an interval [a, b] in which the two end
points are identified with each other (this occurs, for example, if the diffusion is
on a circle). Then we impose boundary conditions derived from those for a
discontinuity, i.e.,

$$\text{I:}\quad \lim_{x\to b^-} p(x,t) = \lim_{x\to a^+} p(x,t) \tag{5.2.17}$$
$$\text{II:}\quad \lim_{x\to b^-} J(x,t) = \lim_{x\to a^+} J(x,t). \tag{5.2.18}$$

Most frequently, periodic boundary conditions are imposed when the functions
A(x, t) and B(x, t) are periodic on the same interval so that we have

$$A(b,t) = A(a,t), \qquad B(b,t) = B(a,t) \tag{5.2.19}$$

and this means that I and II simply reduce to an equality of p(x, t) and its derivatives
at the points a and b.


e) Prescribed Boundaries 

If the diffusion coefficient vanishes at a boundary, we have a situation in which
the kind of boundary may be automatically prescribed. Suppose the motion occurs
only for x > a. If a Lipschitz condition is obeyed by $A(x,t)$ and $\sqrt{B(x,t)}$ at x = a
(Sect. 4.3.1) and B(x, t) is differentiable at x = a then

$$\partial_x B(a,t) = 0. \tag{5.2.20}$$

The SDE then has solutions, and we may write

$$dx(t) = A(x,t)\,dt + \sqrt{B(x,t)}\;dW(t). \tag{5.2.21}$$

In this rather special case, the situation is determined by the sign of A(a, t). Three
cases then occur, as follows.


i) Exit boundary. In this case, we suppose

$$A(a,t) < 0 \tag{5.2.22}$$

so that if the particle reaches the point a, it will certainly proceed out of the region to
x < a. Hence the name "exit boundary".


ii) Entrance boundary. Suppose

$$A(a,t) > 0. \tag{5.2.23}$$

In this case, if the particle reaches the point a, the sign of A(a, t) is such as to
return it to x > a; thus a particle placed to the right of a can never leave the region.
However, a particle introduced at x = a will certainly enter the region. Hence the
name "entrance boundary".


iii) Natural boundary. Finally consider

$$A(a,t) = 0. \tag{5.2.24}$$

The particle, once it reaches x = a, will remain there. However it can be
demonstrated that it cannot ever reach this point. This is a boundary from which we can
neither absorb nor at which we can introduce any particles.

Feller [5.4] has shown that in general the boundaries can be assigned to one of 
the four types; regular, entrance, exit and natural. His general criteria for the 
classification of these boundaries are as follows. Define 


$$f(x) = \exp\left[-2\int_{x_0}^{x} ds\; A(s)/B(s)\right] \tag{5.2.25}$$
$$g(x) = 2/[B(x)\,f(x)] \tag{5.2.26}$$
$$h_1(x) = f(x)\int_{x_0}^{x} g(s)\,ds \tag{5.2.27}$$
$$h_2(x) = g(x)\int_{x_0}^{x} f(s)\,ds. \tag{5.2.28}$$

Here $x_0 \in (a, b)$, and is fixed. Denote by

$$\mathcal{L}(x_1, x_2) \tag{5.2.29}$$

the space of all functions integrable on the interval $(x_1, x_2)$; then the boundary at a
can be classified as

I: Regular: if $f(x) \in \mathcal{L}(a, x_0)$, and $g(x) \in \mathcal{L}(a, x_0)$
II: Exit: if $g(x) \notin \mathcal{L}(a, x_0)$, and $h_1(x) \in \mathcal{L}(a, x_0)$
III: Entrance: if $g(x) \in \mathcal{L}(a, x_0)$, and $h_2(x) \in \mathcal{L}(a, x_0)$
IV: Natural: all other cases.


It can be seen from the results of Sect. 5.2.2 that for an exit boundary there is no
normalisable stationary solution of the FPE, and that the mean time to reach the
boundary, (5.2.161), is finite. Similarly, if the boundary is entrance, a stationary
solution can exist, but the mean time to reach the boundary is infinite. In the case
of a regular boundary, the mean time to reach the boundary is finite, but a
stationary solution with a reflecting boundary at a does exist. The case of natural
boundaries is harder to analyse. The reader is referred to [5.5] for a more complete
description.


f) Boundaries at Infinity 
All of the above kinds of boundary can occur at infinity, provided we can
simultaneously guarantee the normalisation of the probability which, if p(x) is
reasonably well behaved, requires

$$\lim_{x\to\infty} p(x,t) = 0. \tag{5.2.30}$$

If $\partial_x p(x)$ is reasonably well behaved (i.e., does not oscillate infinitely rapidly as
$x \to \infty$),

$$\lim_{x\to\infty} \partial_x p(x,t) = 0 \tag{5.2.31}$$

so that a nonzero current at infinity will usually require either A(x, t) or B(x, t) to
become infinite there. Treatment of such cases is usually best carried out by
changing to another variable which is finite at $x = \infty$.

Where there are boundaries at $x = \pm\infty$ and nonzero currents at infinity are
permitted, we have two possibilities which do not allow for loss of probability:

$$\text{i)}\quad J(\pm\infty, t) = 0 \tag{5.2.32}$$
$$\text{ii)}\quad J(+\infty, t) = J(-\infty, t). \tag{5.2.33}$$

These are the limits of reflecting and periodic boundary conditions, respectively.


5.2.2 Stationary Solutions for Homogeneous Fokker-Planck Equations 


We recall (Sect.3.7.2) that in a homogeneous process, the drift and diffusion
coefficients are time independent. In such a case, the equation satisfied by the stationary
distribution is

$$\frac{d}{dx}[A(x)\,p_s(x)] - \frac{1}{2}\frac{d^2}{dx^2}[B(x)\,p_s(x)] = 0 \tag{5.2.34}$$

which can also be written simply in terms of the current (as defined in Sect.5.2.1)

$$\frac{dJ(x)}{dx} = 0 \tag{5.2.35}$$

which clearly has the solution

$$J(x) = \text{constant}. \tag{5.2.36}$$

Suppose the process takes place on an interval (a, b). Then we must have

$$J(a) = J(x) = J(b) \equiv J \tag{5.2.37}$$

and if one of the boundary conditions is reflecting, this means that both are
reflecting, and J = 0.

If the boundaries are not reflecting, (5.2.37) requires them to be periodic. We
then use the boundary conditions given by (5.2.17,18).


a) Zero Current—Potential Solution 
Setting J = 0, we rewrite (5.2.37) as

$$A(x)\,p_s(x) - \frac{1}{2}\frac{d}{dx}[B(x)\,p_s(x)] = 0 \tag{5.2.38}$$

for which the solution is

$$p_s(x) = \frac{\mathcal{N}}{B(x)}\exp\left[2\int_a^x dx'\,A(x')/B(x')\right], \tag{5.2.39}$$

where $\mathcal{N}$ is a normalisation constant such that

$$\int_a^b dx\; p_s(x) = 1. \tag{5.2.40}$$

Such a solution is known as a potential solution, for various historical reasons, but
chiefly because the stationary solution is obtained by a single integration (the full
significance of this term will be treated in Sect.5.3.3).
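
Because the potential solution requires only one integration, it is easy to tabulate
numerically. The following sketch is an illustration added here, not part of the original
treatment. It evaluates (5.2.39) on a uniform grid with the trapezoidal rule and fixes the
normalisation constant from (5.2.40). The function name `potential_solution`, the grid size,
and the test case A(x) = -kx, B(x) = D are assumptions chosen for illustration.

```python
import math

def potential_solution(A, B, a, b, n=2000):
    """Tabulate p_s(x) = N / B(x) * exp(2 * int_a^x A/B dx') on a uniform grid."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    ratio = [A(x) / B(x) for x in xs]
    # cumulative trapezoidal integral of A/B from a up to each grid point
    phi = [0.0]
    for i in range(1, n + 1):
        phi.append(phi[-1] + 0.5 * h * (ratio[i - 1] + ratio[i]))
    unnorm = [math.exp(2.0 * phi[i]) / B(xs[i]) for i in range(n + 1)]
    # normalisation constant from the trapezoidal estimate of int_a^b p_s dx = 1
    norm = h * (sum(unnorm) - 0.5 * (unnorm[0] + unnorm[-1]))
    return xs, [u / norm for u in unnorm]

# Illustrative case (an assumption): linear drift, constant diffusion
k, D = 1.0, 0.5
xs, ps = potential_solution(lambda x: -k * x, lambda x: D, -4.0, 4.0)
print(ps[1000], math.sqrt(k / (math.pi * D)))   # value at x = 0 against the Gaussian peak
```

For this choice the tabulated density should reproduce a Gaussian proportional to
exp(-k x^2 / D), since the interval is wide enough for the tails to be negligible.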


b) Periodic Boundary Condition 
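Here we have nonzero current J and we rewrite (5.2.36) as

$$A(x)\,p_s(x) - \frac{1}{2}\frac{d}{dx}[B(x)\,p_s(x)] = J. \tag{5.2.41}$$

However, J is not arbitrary, but is determined by normalisation and the periodic
boundary condition

$$p_s(a) = p_s(b) \tag{5.2.42}$$
$$J(a) = J(b). \tag{5.2.43}$$

For convenience, define

$$\psi(x) = \exp\left[2\int_a^x dx'\,A(x')/B(x')\right]. \tag{5.2.44}$$

Then we can easily integrate (5.2.41) to get

$$p_s(x)\,B(x)/\psi(x) = p_s(a)\,B(a)/\psi(a) - 2J\int_a^x dx'/\psi(x'). \tag{5.2.45}$$

By imposing the boundary condition (5.2.42) we find that

$$J = \frac{[\,B(a)/\psi(a) - B(b)/\psi(b)\,]\;p_s(a)}{2\displaystyle\int_a^b dx'/\psi(x')} \tag{5.2.46}$$

so that

$$p_s(x) = p_s(a)\;\frac{\displaystyle\int_a^x \frac{dx'}{\psi(x')}\,\frac{B(b)}{\psi(b)} + \int_x^b \frac{dx'}{\psi(x')}\,\frac{B(a)}{\psi(a)}}{\displaystyle\frac{B(x)}{\psi(x)}\int_a^b \frac{dx'}{\psi(x')}}. \tag{5.2.47}$$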


c) Infinite Range and Singular Boundaries 

In either of these cases, one or the other of the above possibilities may turn out to
be forbidden because of divergences, etc. A full enumeration of the possibilities is,
in general, very complicated. We shall demonstrate these by means of the examples
given in the next section.


5.2.3 Examples of Stationary Solutions


a) Diffusion in a Gravitational Field 
A strongly damped Brownian particle moving in a constant gravitational field is 
often described by the SDE (Sect.6.4) 


$$dx = -g\,dt + \sqrt{D}\;dW(t) \tag{5.2.48}$$

for which the Fokker-Planck equation is

$$\frac{\partial p}{\partial t} = \frac{\partial}{\partial x}(g\,p) + \frac{1}{2}D\,\frac{\partial^2 p}{\partial x^2}. \tag{5.2.49}$$

On the interval (a, b) with reflecting boundary conditions, the stationary solution is
given by (5.2.39), i.e.

$$p_s(x) = \mathcal{N}\exp[-2gx/D], \tag{5.2.50}$$

where we have absorbed constant factors into the definition of $\mathcal{N}$.

Clearly this solution is normalisable on (a, b) only if a is finite, though b may
be infinite. The result is no more profound than to say that particles diffusing in a
beaker of fluid will fall down, and if the beaker is infinitely deep, they will never
stop falling! Diffusion upwards against gravity is possible for any distance but
with exponentially small probability.

Now assume periodic boundary conditions on (a, b). Substitution into (5.2.47)
yields

$$p_s(x) = p_s(a); \tag{5.2.51}$$

a constant distribution.
The interpretation is that the particles pass freely from a to b and back.
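
The constancy of the periodic solution can be checked numerically. The sketch below is an
added illustration under stated assumptions: it evaluates the ratio p_s(x)/p_s(a) from the
periodic-boundary formula (5.2.47) for constant drift A = -g and constant diffusion B = D,
using a simple trapezoidal rule. The helper `trapz`, the function `ps_over_psa` and the
parameter values are hypothetical names and choices made for this example.

```python
import math

def trapz(f, lo, hi, n=2000):
    """Composite trapezoidal rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * (0.5 * f(lo) + sum(f(lo + i * h) for i in range(1, n)) + 0.5 * f(hi))

# Illustrative parameters (assumptions)
g, D, a, b = 1.0, 0.5, 0.0, 1.0
A = lambda x: -g
B = lambda x: D
# psi(x) = exp[2 int_a^x A/B dx'], evaluated analytically for constant A and B
psi = lambda x: math.exp(-2.0 * g * (x - a) / D)

def ps_over_psa(x):
    """Ratio p_s(x) / p_s(a) according to the periodic solution (5.2.47)."""
    inv_psi = lambda y: 1.0 / psi(y)
    num = (trapz(inv_psi, a, x) * B(b) / psi(b)
           + trapz(inv_psi, x, b) * B(a) / psi(a))
    den = B(x) / psi(x) * trapz(inv_psi, a, b)
    return num / den

values = [ps_over_psa(a + i * (b - a) / 10) for i in range(11)]
print(values)   # every entry should be close to 1
```

All eleven values should agree with unity to within the quadrature error, which is the
numerical counterpart of the constant distribution (5.2.51).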


b) Ornstein Uhlenbeck Process 
We use the notation of Sect.3.8.4 where the Fokker-Planck equation was 


$$\frac{\partial p}{\partial t} = \frac{\partial}{\partial x}(kxp) + \frac{1}{2}D\,\frac{\partial^2 p}{\partial x^2}, \tag{5.2.52}$$

whose stationary solution on the interval (a, b) with reflecting barriers is

$$p_s(x) = \mathcal{N}\exp(-kx^2/D). \tag{5.2.53}$$

Provided k > 0, this is normalisable on $(-\infty, \infty)$. If k < 0, one can only make
sense of it on a finite interval.
Suppose

$$a = -b < 0. \tag{5.2.54}$$

Then in this case,

$$\psi(x) = \exp\left[-\frac{k}{D}\,(x^2 - a^2)\right] \tag{5.2.55}$$

and if we consider the periodic boundary condition on this interval, by noting

$$\psi(a) = \psi(-a), \tag{5.2.56}$$

we find that

$$p_s(x) = p_s(a)\,\psi(x)/\psi(a) = p_s(a)\exp\left[-\frac{k}{D}\,(x^2 - a^2)\right]$$

so that the symmetry yields the same solution as in the case of reflecting barriers.
Letting $a \to -\infty$, we see that we still have the same solution. The result is also true
if $a \to -\infty$ independently of $b \to \infty$, provided k > 0.


c) A Chemical Reaction Model 
Although chemical reactions are normally best modelled by a birth-death master 
equation formalism (as in Chap. 7), approximate treatments are often given by 
means of a FPE. The reaction 


$$X + A \rightleftharpoons 2X \tag{5.2.57}$$

is of interest since it possesses an exit boundary at x = 0 (where x is the number
of molecules of X). Clearly if there is no X, a collision between X and A cannot
occur so no more X is produced.

The FPE is derived in Sect.7.6.1 and is

$$\partial_t\,p(x,t) = -\partial_x[(ax - x^2)\,p(x,t)] + \tfrac{1}{2}\,\partial_x^2[(ax + x^2)\,p(x,t)]. \tag{5.2.58}$$

We introduce reflecting boundaries at $x = \alpha$ and $x = \beta$. In this case, the stationary
solution is

$$p_s(x) = e^{-2x}\,(a + x)^{4a-1}\,x^{-1} \tag{5.2.59}$$

which is not normalisable if $\alpha = 0$. The pole at x = 0 is a result of the absorption
there. In fact, comparing with (5.2.28), we see that

$$B(0,t) = (ax + x^2)_{x=0} = 0$$
$$A(0,t) = (ax - x^2)_{x=0} = 0 \tag{5.2.60}$$
$$\partial_x B(0,t) = (a + 2x)_{x=0} > 0$$

so we indeed have an exit boundary. The stationary solution has relevance only
if $\alpha > 0$ since it is otherwise not normalisable. The physical meaning of a reflecting
barrier is quite simple: whenever a molecule of X disappears, we simply add
another one immediately. A plot of $p_s(x)$ is given in Fig. 5.2. The time for all x to
disappear is in practice extraordinarily long, and the stationary solution (5.2.59)
is, in practice, a good representation of the distribution except near x = 0.


Fig. 5.2. Non-normalisable "stationary" $p_s(x)$ for the reaction $X + A \rightleftharpoons 2X$
(axes: $p_s(x)$ against x)
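
That the stationary density of this model carries no current can be verified directly. The
sketch below is an added illustration: it takes the drift ax - x^2 and diffusion ax + x^2 of
(5.2.58), the density e^{-2x} (a + x)^{4a-1} / x of (5.2.59), and evaluates the current
J = A p - (1/2) d(B p)/dx with a central difference. The value a = 2, the step h and the
sample points are assumptions made for the example.

```python
import math

a = 2.0   # illustrative rate parameter (an assumption)

def A(x):
    return a * x - x * x          # drift coefficient of (5.2.58)

def B(x):
    return a * x + x * x          # diffusion coefficient of (5.2.58)

def p_s(x):
    # unnormalised stationary solution (5.2.59)
    return math.exp(-2.0 * x) * (a + x) ** (4.0 * a - 1.0) / x

def current(x, h=1e-5):
    """J = A p - (1/2) d/dx (B p), with the derivative taken by central difference."""
    dBp = (B(x + h) * p_s(x + h) - B(x - h) * p_s(x - h)) / (2.0 * h)
    return A(x) * p_s(x) - 0.5 * dBp

for x in (0.5, 1.0, 2.0, 3.0):
    print(x, current(x) / p_s(x))   # relative current, expected to be near zero
```

The 1/x factor in `p_s` is what prevents normalisation when the lower reflecting boundary
is moved to the origin.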


5.2.4 Boundary Conditions for the Backward Fokker-Planck Equation 

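We suppose that $p(\mathbf{x},t\,|\,\mathbf{x}',t')$ obeys the forward Fokker-Planck equation for a set
of $\mathbf{x}, t$ and $\mathbf{x}', t'$, and that the process is confined to a region R with boundary S.
Then, if s is a time between t and t',

$$0 = \frac{\partial}{\partial s}\,p(\mathbf{x},t\,|\,\mathbf{x}',t') = \frac{\partial}{\partial s}\int_R d\mathbf{y}\; p(\mathbf{x},t\,|\,\mathbf{y},s)\,p(\mathbf{y},s\,|\,\mathbf{x}',t'), \tag{5.2.61}$$

where we have used the Chapman-Kolmogorov equation. We take the derivative
$\partial/\partial s$ inside the integral, use the forward Fokker-Planck equation for the second
factor and the backward equation for the first factor. For brevity, let us write

$$p(\mathbf{y},s) = p(\mathbf{y},s\,|\,\mathbf{x}',t'), \qquad \bar{p}(\mathbf{y},s) = p(\mathbf{x},t\,|\,\mathbf{y},s). \tag{5.2.62}$$

Then,

$$0 = \int_R d\mathbf{y}\left[-\sum_i \frac{\partial}{\partial y_i}(A_i\,p) + \frac{1}{2}\sum_{i,j}\frac{\partial^2}{\partial y_i\,\partial y_j}(B_{ij}\,p)\right]\bar{p} + \int_R d\mathbf{y}\left[-\sum_i A_i\,\frac{\partial\bar{p}}{\partial y_i} - \frac{1}{2}\sum_{i,j}B_{ij}\,\frac{\partial^2\bar{p}}{\partial y_i\,\partial y_j}\right]p \tag{5.2.63}$$

and after some manipulation

$$= \int_R d\mathbf{y}\sum_i \frac{\partial}{\partial y_i}\left\{-A_i\,p\,\bar{p} + \frac{1}{2}\sum_j\left[\bar{p}\,\frac{\partial}{\partial y_j}(B_{ij}\,p) - p\,B_{ij}\,\frac{\partial\bar{p}}{\partial y_j}\right]\right\} \tag{5.2.64}$$

$$= \int_S \sum_i dS_i\left\{\bar{p}\left[-A_i\,p + \frac{1}{2}\sum_j \frac{\partial}{\partial y_j}(B_{ij}\,p)\right]\right\} - \frac{1}{2}\int_S \sum_i dS_i\; p\left(\sum_j B_{ij}\,\frac{\partial\bar{p}}{\partial y_j}\right). \tag{5.2.65}$$

We now treat the various cases individually.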


a) Absorbing Boundaries 

This requires p = 0 on the boundary. That it also requires $\bar{p}(\mathbf{y},t) = 0$ on the
boundary is easily seen to be consistent with (5.2.65) since on substituting p = 0 in that
equation, we get

$$0 = \int_S \bar{p}\,\sum_{i,j} dS_i\,B_{ij}\,\frac{\partial p}{\partial y_j}. \tag{5.2.66}$$

However, if the boundary is absorbing, clearly

$$p(\mathbf{x},t\,|\,\mathbf{y},s) = 0, \quad \text{for } \mathbf{y}\in\text{boundary} \tag{5.2.67}$$

since this merely states that the probability of X re-entering R from the boundary
is zero.


b) Reflecting Boundaries 
Here the condition on the forward equation makes the first integral vanish in
(5.2.65). The final factor vanishes for arbitrary p only if

$$\sum_{i,j} n_i\,B_{ij}(\mathbf{y})\,\frac{\partial}{\partial y_j}\,[\,p(\mathbf{x},t\,|\,\mathbf{y},s)\,] = 0. \tag{5.2.68}$$

In one dimension this reduces to

$$\frac{\partial}{\partial y}\,p(x,t\,|\,y,s) = 0 \tag{5.2.69}$$

unless B vanishes.


c) Other Boundaries 
We shall not consider these in this section. For further details see [5.4]. 


5.2.5 Eigenfunction Methods (Homogeneous Processes) 


We shall now show how, in the case of homogeneous processes, solutions can most 
naturally be expressed in terms of eigenfunctions. We consider reflecting and 
absorbing boundaries. 


a) Eigenfunctions for Reflecting Boundaries 


We consider a Fokker-Planck equation for a process on an interval (a, b) with
reflecting boundaries. We suppose the FPE to have a stationary solution $p_s(x)$ and
the form

$$\partial_t\,p(x,t) = -\partial_x[A(x)\,p(x,t)] + \tfrac{1}{2}\,\partial_x^2[B(x)\,p(x,t)]. \tag{5.2.70}$$

We define a function q(x, t) by

$$p(x,t) = p_s(x)\,q(x,t) \tag{5.2.71}$$

and, by direct substitution, find that q(x, t) satisfies the backward equation

$$\partial_t\,q(x,t) = A(x)\,\partial_x q(x,t) + \tfrac{1}{2}\,B(x)\,\partial_x^2 q(x,t). \tag{5.2.72}$$

We now wish to consider solutions of the form

$$p(x,t) = P_\lambda(x)\,e^{-\lambda t} \tag{5.2.73}$$
$$q(x,t) = Q_\lambda(x)\,e^{-\lambda t} \tag{5.2.74}$$

which obey the eigenfunction equations

$$-\partial_x[A(x)\,P_\lambda(x)] + \tfrac{1}{2}\,\partial_x^2[B(x)\,P_\lambda(x)] = -\lambda\,P_\lambda(x) \tag{5.2.75}$$
$$A(x)\,\partial_x Q_{\lambda'}(x) + \tfrac{1}{2}\,B(x)\,\partial_x^2 Q_{\lambda'}(x) = -\lambda'\,Q_{\lambda'}(x). \tag{5.2.76}$$

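Then we can straightforwardly show by partial integration that

$$(\lambda' - \lambda)\int_a^b dx\;P_\lambda(x)\,Q_{\lambda'}(x) = \Big[\,Q_{\lambda'}(x)\,\{-A(x)\,P_\lambda(x) + \tfrac{1}{2}\,\partial_x[B(x)\,P_\lambda(x)]\} - \tfrac{1}{2}\,B(x)\,P_\lambda(x)\,\partial_x Q_{\lambda'}(x)\,\Big]_a^b, \tag{5.2.77}$$

and using the reflecting boundary condition on the coefficient of $Q_{\lambda'}(x)$, we see that
it vanishes. Further, using the definition of q(x, t) in terms of the stationary solution
(5.2.71), it is simple to show that

$$\tfrac{1}{2}\,B(x)\,p_s(x)\,\partial_x Q_{\lambda'}(x) = -A(x)\,P_{\lambda'}(x) + \tfrac{1}{2}\,\partial_x[B(x)\,P_{\lambda'}(x)] \tag{5.2.78}$$

so that term vanishes also. Hence, the $Q_\lambda(x)$ and $P_\lambda(x)$ form a bi-orthogonal system

$$\int_a^b dx\;P_\lambda(x)\,Q_{\lambda'}(x) = \delta_{\lambda\lambda'} \tag{5.2.79}$$

or, there are two alternative orthogonality systems,

$$\int_a^b dx\;p_s(x)\,Q_\lambda(x)\,Q_{\lambda'}(x) = \delta_{\lambda\lambda'} \tag{5.2.80}$$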


$$\int_a^b dx\;[p_s(x)]^{-1}\,P_\lambda(x)\,P_{\lambda'}(x) = \delta_{\lambda\lambda'}. \tag{5.2.81}$$

It should be noted that setting $\lambda = \lambda' = 0$ gives the normalisation of the stationary
solution $p_s(x)$ since

$$P_0(x) = p_s(x) \tag{5.2.82}$$
$$Q_0(x) = 1. \tag{5.2.83}$$

Using this orthogonality we can write any solution in terms of eigenfunctions. For if

$$p(x,t) = \sum_\lambda A_\lambda\,P_\lambda(x)\,e^{-\lambda t}, \tag{5.2.84}$$

then

$$\int_a^b dx\;Q_\lambda(x)\,p(x,0) = A_\lambda. \tag{5.2.85}$$

For example, the conditional probability $p(x,t\,|\,x_0,0)$ is given by the initial
condition

$$p(x,0\,|\,x_0,0) = \delta(x - x_0) \tag{5.2.86}$$

so that

$$A_\lambda = \int_a^b dx\;Q_\lambda(x)\,\delta(x - x_0) = Q_\lambda(x_0) \tag{5.2.87}$$

and hence,

$$p(x,t\,|\,x_0,0) = \sum_\lambda P_\lambda(x)\,Q_\lambda(x_0)\,e^{-\lambda t}. \tag{5.2.88}$$

We can write the autocorrelation function quite elegantly as

$$\langle x(t)\,x(0)\rangle = \int dx\int dx_0\; x\,x_0\;p(x,t\,|\,x_0,0)\,p_s(x_0) \tag{5.2.89}$$
$$= \sum_\lambda\left[\int dx\;x\,P_\lambda(x)\right]^2 e^{-\lambda t}, \tag{5.2.90}$$

where we have used the definition of $Q_\lambda(x)$ by (5.2.74).


b) Eigenfunctions for Absorbing Boundaries 

This is very similar. We define $P_\lambda$ and $Q_\lambda$ as above, except that $p_s(x)$ is still the
stationary solution of the Fokker-Planck equation with reflecting boundary conditions.
With this definition, we find that we must have

$$P_\lambda(a) = Q_\lambda(a) = P_\lambda(b) = Q_\lambda(b) = 0 \tag{5.2.91}$$

and the orthogonality proof still follows through. Eigenfunctions are then computed
using this condition and the eigenfunction equations (5.2.75, 76) and all other
results look the same. However, the range of $\lambda$ does not include $\lambda = 0$, and hence
$p(x,t\,|\,x_0,0) \to 0$ as $t \to \infty$.


5.2.6 Examples 


a) A Wiener Process with Absorbing Boundaries 
The Fokker-Planck equation 


$$\partial_t\,p = \tfrac{1}{2}\,\partial_x^2\,p \tag{5.2.92}$$

is treated on the interval (0, 1). The absorbing boundary condition requires

$$p(0,t) = p(1,t) = 0 \tag{5.2.93}$$

and the appropriate eigenfunctions are $\sin(n\pi x)$ so we expand in a Fourier sine
series

$$p(x,t) = \sum_{n=1}^{\infty} b_n(t)\,\sin(n\pi x) \tag{5.2.94}$$

which automatically satisfies (5.2.93). The initial condition is chosen so that

$$p(x,0) = \delta(x - x_0) \tag{5.2.95}$$

for which the Fourier coefficients are

$$b_n(0) = 2\int_0^1 dx\;\delta(x - x_0)\,\sin(n\pi x) = 2\sin(n\pi x_0). \tag{5.2.96}$$

Substituting the Fourier expansion (5.2.94) into (5.2.92) gives

$$\dot{b}_n(t) = -\lambda_n\,b_n(t) \tag{5.2.97}$$

with

$$\lambda_n = n^2\pi^2/2 \tag{5.2.98}$$

and the solution

$$b_n(t) = b_n(0)\,\exp(-\lambda_n t). \tag{5.2.99}$$

So we have the solution [which by the initial condition (5.2.95) is for the conditional
probability $p(x,t\,|\,x_0,0)$]

$$p(x,t\,|\,x_0,0) = 2\sum_{n=1}^{\infty}\exp(-\lambda_n t)\,\sin(n\pi x_0)\,\sin(n\pi x). \tag{5.2.100}$$
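
The sine series is straightforward to evaluate. The sketch below is an added illustration:
it sums a truncated form of the series for the conditional probability with absorbing ends
at 0 and 1, and also integrates it term by term over (0, 1) to give the probability that the
particle has not yet been absorbed. The function names `p_absorbing` and `survival` and the
truncation at 200 terms are assumptions. The term-by-term integral uses the fact that
sin(n pi x) integrates to 2/(n pi) over (0, 1) for odd n and to zero for even n.

```python
import math

def p_absorbing(x, t, x0, n_terms=200):
    """Truncated series (5.2.100) for the Wiener process on (0, 1), absorbing ends."""
    total = 0.0
    for n in range(1, n_terms + 1):
        lam = n * n * math.pi ** 2 / 2.0          # eigenvalue n^2 pi^2 / 2
        total += math.exp(-lam * t) * math.sin(n * math.pi * x0) * math.sin(n * math.pi * x)
    return 2.0 * total

def survival(t, x0, n_terms=200):
    """Integral of the series over (0, 1): probability of not yet being absorbed."""
    total = 0.0
    for n in range(1, n_terms + 1, 2):            # even n integrate to zero
        lam = n * n * math.pi ** 2 / 2.0
        total += math.exp(-lam * t) * math.sin(n * math.pi * x0) * 2.0 / (n * math.pi)
    return 2.0 * total

print(survival(0.01, 0.5))   # short time: almost no absorption yet
print(survival(2.0, 0.5))    # long time: dominated by the n = 1 term
```

At long times only the lowest eigenvalue survives, so the decay of `survival` is governed
by exp(-pi^2 t / 2).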


b) Wiener Process with Reflecting Boundaries
Here the boundary condition reduces to [on the interval (0, 1)]

$$\partial_x\,p(0,t) = \partial_x\,p(1,t) = 0 \tag{5.2.101}$$

and the eigenfunctions are now $\cos(n\pi x)$, so we make a Fourier cosine expansion

$$p(x,t) = \tfrac{1}{2}\,a_0 + \sum_{n=1}^{\infty} a_n(t)\,\cos(n\pi x) \tag{5.2.102}$$

with the same initial condition

$$p(x,0) = \delta(x - x_0) \tag{5.2.103}$$

so that

$$a_n(0) = 2\int_0^1 dx\;\cos(n\pi x)\,\delta(x - x_0) = 2\cos(n\pi x_0). \tag{5.2.104}$$

In the same way as before, we find

$$a_n(t) = a_n(0)\,\exp(-\lambda_n t) \tag{5.2.105}$$

with

$$\lambda_n = n^2\pi^2/2 \tag{5.2.106}$$

so that

$$p(x,t\,|\,x_0,0) = 1 + 2\sum_{n=1}^{\infty}\cos(n\pi x_0)\,\cos(n\pi x)\,\exp(-\lambda_n t). \tag{5.2.107}$$

As $t \to \infty$, the process becomes stationary, with stationary distribution

$$p_s(x) = \lim_{t\to\infty} p(x,t\,|\,x_0,0) = 1. \tag{5.2.108}$$
We can compute the stationary autocorrelation function by

$$\langle x(t)\,x(0)\rangle_s = \int_0^1\!\!\int_0^1 dx\,dx_0\; x\,x_0\;p(x,t\,|\,x_0,0)\,p_s(x_0) \tag{5.2.109}$$

and carrying out the integrals explicitly,

$$\langle x(t)\,x(0)\rangle_s = \frac{1}{4} + \frac{8}{\pi^4}\sum_{n=0}^{\infty}\exp(-\lambda_{2n+1}\,t)\,(2n+1)^{-4}. \tag{5.2.110}$$

We see that as $t \to \infty$, all the exponentials vanish and

$$\langle x(t)\,x(0)\rangle_s \to \tfrac{1}{4} = [\langle x\rangle_s]^2 \tag{5.2.111}$$

and as $t \to 0$,

$$\langle x(t)\,x(0)\rangle_s \to \frac{1}{4} + \frac{8}{\pi^4}\sum_{n=0}^{\infty}(2n+1)^{-4} \tag{5.2.112}$$
$$= \tfrac{1}{3} = \langle x^2\rangle_s \tag{5.2.113}$$

when one takes account of the identity (from the theory of the Riemann zeta-function)

$$\sum_{n=0}^{\infty}(2n+1)^{-4} = \frac{\pi^4}{96}. \tag{5.2.114}$$
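
The two limits of the autocorrelation function can be confirmed by summing the series
numerically. The following is an added sketch: it evaluates the series 1/4 + (8/pi^4) times
the sum over odd m = 2n + 1 of exp(-m^2 pi^2 t / 2) / m^4, truncated after a fixed number
of terms. The function name `autocorr` and the truncation at 2000 terms are assumptions.

```python
import math

def autocorr(t, n_terms=2000):
    """Stationary autocorrelation (5.2.110) of the reflected Wiener process on (0, 1)."""
    total = 0.0
    for n in range(n_terms):
        m = 2 * n + 1
        lam = m * m * math.pi ** 2 / 2.0      # eigenvalue for odd index m
        total += math.exp(-lam * t) / m ** 4
    return 0.25 + 8.0 / math.pi ** 4 * total

print(autocorr(0.0))    # close to 1/3, the stationary mean square
print(autocorr(50.0))   # close to 1/4, the square of the stationary mean
```

The value at t = 0 is a numerical check of the zeta-function identity, since 1/4 + 8/96
equals 1/3.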


c) Ornstein-Uhlenbeck Process 
As in Sect.3.8.4, the Fokker-Planck equation is 


$$\partial_t\,p(x,t) = \partial_x[kx\,p(x,t)] + \tfrac{1}{2}\,D\,\partial_x^2\,p(x,t). \tag{5.2.115}$$

The eigenfunction equation for $Q_\lambda$ is

$$d_x^2 Q_\lambda - \frac{2kx}{D}\,d_x Q_\lambda + \frac{2\lambda}{D}\,Q_\lambda = 0 \tag{5.2.116}$$

and this becomes the differential equation for Hermite polynomials [5.6] on making
the replacement $y = x\sqrt{k/D}$:

$$d_y^2 Q_\lambda - 2y\,d_y Q_\lambda + (2\lambda/k)\,Q_\lambda = 0. \tag{5.2.117}$$

We can write

$$Q_\lambda = (2^n n!)^{-1/2}\,H_n(x\sqrt{k/D}), \tag{5.2.118}$$

where

$$\lambda = nk, \tag{5.2.119}$$

and these solutions are normalised as in (5.2.79-81).
The stationary solution is, as previously found,

$$p_s(x) = (k/\pi D)^{1/2}\exp(-kx^2/D) \tag{5.2.120}$$

and a general solution can be written as

$$p(x,t) = \sum_n \sqrt{k/(2^n n!\,\pi D)}\;\exp(-kx^2/D)\,H_n(x\sqrt{k/D})\,e^{-nkt}\,A_n \tag{5.2.121}$$
with

$$A_n = \int_{-\infty}^{\infty} dx\;p(x,0)\,H_n(x\sqrt{k/D})\,(2^n n!)^{-1/2}. \tag{5.2.122}$$

The result can also be obtained directly from the explicit solution for the
conditional probability given in Sect.3.8.4 by using generating functions for Hermite
polynomials. One sees that the time scale of relaxation to the stationary state is
given by the eigenvalues

$$\lambda_n = nk. \tag{5.2.123}$$

Here, k is the rate constant for deterministic relaxation, and it thus determines the
slowest time in the relaxation. One can also compute the autocorrelation function
directly using (5.2.90). We use the result [5.6] that

$$H_1(y) = 2y \tag{5.2.124}$$

so that the orthogonality property means that only the eigenfunction corresponding
to n = 1 has a nonzero coefficient. We must compute

$$\int x\,P_{\lambda_1}(x)\,dx = \int_{-\infty}^{\infty}\sqrt{k/(2\pi D)}\;\exp(-kx^2/D)\,(2x\sqrt{k/D})\,x\,dx = \sqrt{D/2k} \tag{5.2.125}$$

so that

$$\langle x(t)\,x(0)\rangle_s = \frac{D}{2k}\,e^{-kt}, \tag{5.2.126}$$

as found previously in Sects.3.8.4, 4.4.4.
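
The normalisation of the Hermite eigenfunctions and the coefficient sqrt(D/2k) can be
checked by direct quadrature. The sketch below is an added illustration with assumed
parameter values: `hermite` builds the physicists' Hermite polynomials from the standard
three-term recurrence, `Q` is the eigenfunction (2^n n!)^(-1/2) H_n(x sqrt(k/D)), and
`integrate` is a plain trapezoidal rule on a range wide enough for the Gaussian weight to
be negligible at the ends. All function names here are hypothetical.

```python
import math

def hermite(n, y):
    """Physicists' Hermite polynomial H_n(y) via H_{m+1} = 2 y H_m - 2 m H_{m-1}."""
    h_prev, h = 1.0, 2.0 * y
    if n == 0:
        return h_prev
    for m in range(1, n):
        h_prev, h = h, 2.0 * y * h - 2.0 * m * h_prev
    return h

k, D = 1.5, 0.8   # illustrative parameters (assumptions)

def p_s(x):
    return math.sqrt(k / (math.pi * D)) * math.exp(-k * x * x / D)

def Q(n, x):
    return hermite(n, x * math.sqrt(k / D)) / math.sqrt(2.0 ** n * math.factorial(n))

def integrate(f, lo=-8.0, hi=8.0, n=4000):
    h = (hi - lo) / n
    return h * (0.5 * f(lo) + sum(f(lo + i * h) for i in range(1, n)) + 0.5 * f(hi))

def overlap(m, n):
    """Weighted overlap of (5.2.80): integral of p_s Q_m Q_n."""
    return integrate(lambda x: p_s(x) * Q(m, x) * Q(n, x))

# integral of x P_1(x) with P_1 = p_s Q_1, as needed for the autocorrelation function
first_moment = integrate(lambda x: x * p_s(x) * Q(1, x))
print(overlap(2, 2), overlap(1, 3), first_moment, math.sqrt(D / (2.0 * k)))
```

Squaring `first_moment` gives the prefactor D/(2k) of the exponentially decaying
autocorrelation function.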


d) Rayleigh Process 


We take the model of amplitude fluctuations developed in Sect.4.4.5. The Fokker- 
Planck equation is 


$$\partial_t\,p(x,t) = \partial_x[(\gamma x - \mu/x)\,p(x,t)] + \mu\,\partial_x^2\,p(x,t), \tag{5.2.127}$$

where

$$\mu = \varepsilon^2/2. \tag{5.2.128}$$

The range here is $(0, \infty)$ and the stationary solution (normalised) is

$$p_s(x) = (\gamma x/\mu)\,\exp(-\gamma x^2/2\mu). \tag{5.2.129}$$

The eigenfunction equation for the $Q_\lambda(x)$ is

$$d_x^2 Q_\lambda + (1/x - \gamma x/\mu)\,d_x Q_\lambda + (\lambda/\mu)\,Q_\lambda = 0. \tag{5.2.130}$$

By setting

$$z = x^2\gamma/2\mu \tag{5.2.131}$$

we obtain

$$z\,d_z^2 Q_\lambda + (1 - z)\,d_z Q_\lambda + (\lambda/2\gamma)\,Q_\lambda = 0. \tag{5.2.132}$$

This is the differential equation for the Laguerre polynomials [5.6] provided

$$\lambda = 2n\gamma. \tag{5.2.133}$$

We can write

$$Q_\lambda(x) = L_n(\gamma x^2/2\mu) \tag{5.2.134}$$


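which is normalised. Hence, the conditional probability is

$$p(x,t\,|\,x_0,0) = \sum_{n=0}^{\infty}\frac{\gamma x}{\mu}\,\exp\left(-\frac{\gamma x^2}{2\mu}\right)L_n\!\left(\frac{\gamma x_0^2}{2\mu}\right)L_n\!\left(\frac{\gamma x^2}{2\mu}\right)e^{-2n\gamma t}. \tag{5.2.135}$$

We can compute the autocorrelation function by the method of (5.2.90):

$$\langle x(t)\,x(0)\rangle = \sum_{n=0}^{\infty}\left[\int_0^{\infty} x\,dx\;\frac{\gamma x}{\mu}\,\exp\left(-\frac{\gamma x^2}{2\mu}\right)L_n\!\left(\frac{\gamma x^2}{2\mu}\right)\right]^2\exp(-2n\gamma t) \tag{5.2.136}$$

and using

$$\int_0^{\infty} dz\;z^{\alpha}\,e^{-z}\,L_n(z) = (-1)^n\,\Gamma(\alpha + 1)\binom{\alpha}{n}, \tag{5.2.137}$$

we find for the autocorrelation function

$$\langle x(t)\,x(0)\rangle = \frac{2\mu}{\gamma}\sum_{n=0}^{\infty}\frac{\pi}{4}\binom{1/2}{n}^{2}\exp(-2n\gamma t). \tag{5.2.138}$$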
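
The Laguerre series can be checked against the two limits that follow from the stationary
distribution (gamma x / mu) exp(-gamma x^2 / 2 mu): the mean square 2 mu / gamma at t = 0 and
the squared mean pi mu / (2 gamma) as t grows. The sketch below is an added illustration: it
sums (2 mu / gamma)(pi / 4) times the sum over n of C(1/2, n)^2 exp(-2 n gamma t), building
the generalised binomial coefficient C(1/2, n) by recurrence. The function name `autocorr`,
the truncation and the parameter values are assumptions.

```python
import math

def autocorr(t, gamma, mu, n_terms=5000):
    """Truncated series for the Rayleigh-process autocorrelation function (5.2.138)."""
    total = 0.0
    c = 1.0                                    # c holds C(1/2, n), starting at C(1/2, 0) = 1
    for n in range(n_terms):
        total += c * c * math.exp(-2.0 * n * gamma * t)
        c *= (0.5 - n) / (n + 1)               # C(1/2, n+1) obtained from C(1/2, n)
    return (2.0 * mu / gamma) * (math.pi / 4.0) * total

gamma, mu = 2.0, 0.5                           # illustrative parameters (assumptions)
print(autocorr(0.0, gamma, mu), 2.0 * mu / gamma)
print(autocorr(100.0, gamma, mu), math.pi * mu / (2.0 * gamma))
```

The terms of the series fall off only as an inverse cube of n, so the t = 0 value converges
slowly compared with the values at later times.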


5.2.7 First Passage Times for Homogeneous Processes 


It is often of interest to know how long a particle whose position is described by 
a Fokker-Planck equation remains in a certain region of x. The solution of this 
problem can be achieved by use of the backward Fokker-Planck equation, as 
described in Sect. 3.6. 


a) Two Absorbing Barriers 
Let the particle be initially at x at time t = 0 and let us ask how long it remains
in the interval (a, b) which is assumed to contain x:

$$a \le x \le b. \tag{5.2.139}$$

We erect absorbing barriers at a and b so that the particle is removed from the
system when it reaches a or b. Hence, if it is still in the interval (a, b), it has never
left that interval.

Under these conditions, the probability that at time t the particle is still in
(a, b) is

$$\int_a^b dx'\;p(x',t\,|\,x,0) \equiv G(x,t). \tag{5.2.140}$$
Let the time that the particle leaves (a, b) be T. Then we can rewrite (5.2.140) as

$$\text{Prob}(T \ge t) = \int_a^b dx'\;p(x',t\,|\,x,0) \tag{5.2.141}$$

which means that G(x, t) is the same as Prob(T ≥ t). Since the system is time
homogeneous, we can write

$$p(x',t\,|\,x,0) = p(x',0\,|\,x,-t) \tag{5.2.142}$$

and the backward Fokker-Planck equation can be written

$$\partial_t\,p(x',t\,|\,x,0) = A(x)\,\partial_x\,p(x',t\,|\,x,0) + \tfrac{1}{2}\,B(x)\,\partial_x^2\,p(x',t\,|\,x,0) \tag{5.2.143}$$

and hence, G(x, t) obeys the equation

$$\partial_t\,G(x,t) = A(x)\,\partial_x G(x,t) + \tfrac{1}{2}\,B(x)\,\partial_x^2 G(x,t). \tag{5.2.144}$$

The boundary conditions are clearly that

$$p(x',0\,|\,x,0) = \delta(x - x')$$

and hence,

$$G(x,0) = \begin{cases} 1 & a \le x \le b \\ 0 & \text{elsewhere} \end{cases} \tag{5.2.145}$$

and if x = a or b, the particle is absorbed immediately, so
Prob(T ≥ t) = 0 when x = a or b, i.e.,

$$G(a,t) = G(b,t) = 0. \tag{5.2.146}$$

Since G(x, t) is the probability that T ≥ t, the mean of any function of T is

$$\langle f(T)\rangle = -\int_0^{\infty} f(t)\,dG(x,t). \tag{5.2.147}$$

Thus, the mean first passage time

$$T(x) = \langle T\rangle \tag{5.2.148}$$

is given by

$$T(x) = -\int_0^{\infty} t\,\partial_t G(x,t)\,dt \tag{5.2.149}$$
$$= \int_0^{\infty} G(x,t)\,dt \tag{5.2.150}$$

after integrating by parts.
Similarly, defining

$$T_n(x) = \langle T^n\rangle, \tag{5.2.151}$$

we find

$$T_n(x) = n\int_0^{\infty} t^{\,n-1}\,G(x,t)\,dt. \tag{5.2.152}$$

We can derive a simple ordinary differential equation for T(x) by using (5.2.150)
and integrating (5.2.144) over $(0, \infty)$. Noting that

$$\int_0^{\infty}\partial_t G(x,t)\,dt = G(x,\infty) - G(x,0) = -1, \tag{5.2.153}$$

we derive

$$A(x)\,\partial_x T(x) + \tfrac{1}{2}\,B(x)\,\partial_x^2 T(x) = -1 \tag{5.2.154}$$

with the boundary condition

$$T(a) = T(b) = 0. \tag{5.2.155}$$

Similarly, we see that

$$-n\,T_{n-1}(x) = A(x)\,\partial_x T_n(x) + \tfrac{1}{2}\,B(x)\,\partial_x^2 T_n(x) \tag{5.2.156}$$

which means that all the moments of the first passage time can be found by repeated
integration.
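
The boundary value problem for the mean first passage time can also be solved numerically.
The following sketch is an added illustration and not the method of the text, which
integrates the equation analytically: it discretises A T' + (1/2) B T'' = -1 with
T(a) = T(b) = 0 by central differences and solves the resulting tridiagonal system with the
Thomas algorithm. The function name `mean_first_passage` and the grid size are assumptions.
For the Wiener process (A = 0, B = 1) on (0, 1) the equation reduces to T'' = -2, whose
solution with these boundary values is T(x) = x(1 - x), which serves as a check.

```python
def mean_first_passage(A, B, a, b, n=200):
    """Solve A T' + (1/2) B T'' = -1, T(a) = T(b) = 0, by central differences."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    lower, diag, upper, rhs = [], [], [], []
    for i in range(1, n):                       # interior unknowns T_1 .. T_{n-1}
        Ai, Bi = A(xs[i]), B(xs[i])
        lower.append(0.5 * Bi / h ** 2 - 0.5 * Ai / h)   # coefficient of T_{i-1}
        diag.append(-Bi / h ** 2)                        # coefficient of T_i
        upper.append(0.5 * Bi / h ** 2 + 0.5 * Ai / h)   # coefficient of T_{i+1}
        rhs.append(-1.0)
    # Thomas algorithm: forward elimination
    for i in range(1, n - 1):
        w = lower[i] / diag[i - 1]
        diag[i] -= w * upper[i - 1]
        rhs[i] -= w * rhs[i - 1]
    # back substitution
    T = [0.0] * (n - 1)
    T[-1] = rhs[-1] / diag[-1]
    for i in range(n - 3, -1, -1):
        T[i] = (rhs[i] - upper[i] * T[i + 1]) / diag[i]
    return xs, [0.0] + T + [0.0]                # absorbing values at both ends

# Check case: Wiener process on (0, 1)
xs, T = mean_first_passage(lambda x: 0.0, lambda x: 1.0, 0.0, 1.0)
print(T[100], xs[100] * (1.0 - xs[100]))        # mean exit time from the midpoint
```

Central differences are exact for quadratic solutions, so in this check case the only error
is rounding; for general A and B the error decreases as the square of the grid spacing.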


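Solutions of the Equations. Equation (5.2.154) can be solved directly by integration.
The solution, after some manipulation, can be written in terms of

$$\psi(x) = \exp\left\{\int_a^x dx'\,[2A(x')/B(x')]\right\}. \tag{5.2.157}$$

We find

$$T(x) = \frac{2\left[\left(\displaystyle\int_a^x \frac{dy}{\psi(y)}\right)\displaystyle\int_x^b \frac{dy'}{\psi(y')}\int_a^{y'}\frac{dz\,\psi(z)}{B(z)} - \left(\displaystyle\int_x^b \frac{dy}{\psi(y)}\right)\displaystyle\int_a^x \frac{dy'}{\psi(y')}\int_a^{y'}\frac{dz\,\psi(z)}{B(z)}\right]}{\displaystyle\int_a^b \frac{dy}{\psi(y)}}. \tag{5.2.158}$$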


b) One Absorbing Barrier
We consider motion still in the interval (a, b) but suppose the barrier at a to be
reflecting. The boundary conditions then become

$$\partial_x G(a,t) = 0 \tag{5.2.159a}$$
$$G(b,t) = 0 \tag{5.2.159b}$$

which follow from the conditions on the backward Fokker-Planck equation
derived in Sect.5.2.4. We solve (5.2.154) with the corresponding boundary
condition and obtain

$$T(x) = 2\int_x^b \frac{dy}{\psi(y)}\int_a^y \frac{\psi(z)}{B(z)}\,dz \qquad (a \text{ reflecting},\; b \text{ absorbing},\; a < b). \tag{5.2.160}$$

Similarly, one finds

$$T(x) = 2\int_a^x \frac{dy}{\psi(y)}\int_y^b \frac{\psi(z)}{B(z)}\,dz \qquad (b \text{ reflecting},\; a \text{ absorbing},\; a < b). \tag{5.2.161}$$


c) Application—Escape Over a Potential Barrier 
We suppose that a point moves according to the Fokker-Planck equation 


$$\partial_t\,p(x,t) = \partial_x[U'(x)\,p(x,t)] + D\,\partial_x^2\,p(x,t). \tag{5.2.162}$$

The potential has maxima and minima, as shown in Fig. 5.3. We suppose that
motion is on an infinite range, which means the stationary solution is

$$p_s(x) = \mathcal{N}\exp[-U(x)/D] \tag{5.2.163}$$

which is bimodal (as shown in Fig. 5.3) so that there is a relatively high probability
of being on the left or the right of b, but not near b. What is the mean escape time
from the left hand well? By this we mean, what is the mean first passage time from
a to $x_0$, where $x_0$ is in the vicinity of b? We use (5.2.160) with the substitutions

$$b \to x_0, \qquad a \to -\infty, \qquad x \to a \tag{5.2.164}$$

so that

$$T(a \to x_0) = \frac{1}{D}\int_a^{x_0} dy\;\exp[U(y)/D]\int_{-\infty}^{y}\exp[-U(z)/D]\,dz. \tag{5.2.165}$$


Fig. 5.3. (a) Double well potential U(x);
(b) Stationary distribution $p_s(x)$;
(c) Mean first passage time from a to $x_0$, $T(a \to x_0)$


If the central maximum of U(x) is large and D is small, then exp[U(y)/D] is sharply
peaked at y = b, while exp[−U(z)/D] is very small near z = b. Therefore,
$\int_{-\infty}^{y}\exp[-U(z)/D]\,dz$ is a very slowly varying function of y near y = b. This means
that the value of the integral $\int_{-\infty}^{y}\exp[-U(z)/D]\,dz$ will be approximately constant
for those values of y which yield a value of exp[U(y)/D] which is significantly
different from zero. Hence, in the inner integral, we can set y = b and remove
the resulting constant factor from inside the integral with respect to y. Hence,
we can approximate (5.2.165) by

$$T(a \to x_0) \simeq \left\{\frac{1}{D}\int_{-\infty}^{b} dy\;\exp[-U(y)/D]\right\}\int_a^{x_0} dy\;\exp[U(y)/D]. \tag{5.2.166}$$

Notice that by the definition of $p_s(x)$ in (5.2.163), we can say that

$$\int_{-\infty}^{b} dy\;\exp[-U(y)/D] = n_a/\mathcal{N} \tag{5.2.167}$$

which means that $n_a$ is the probability that the particle is to the left of b when the
system is stationary.

A plot of $T(a \to x_0)$ against $x_0$ is shown in Fig. 5.3 and shows that the mean first
passage time to $x_0$ is quite small for $x_0$ in the left well and quite large for $x_0$ in the
right well. This means that the particle, in going over the barrier to the right well,
takes most of the time in actually surmounting the barrier. It is quite meaningful
to talk of the escape time as that time for the particle, initially at a, to reach a point
near c, since this time is quite insensitive to the exact location of the initial and
final points. We can evaluate this by further assuming that near b we can write


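$$U(x) \simeq U(b) - \frac{1}{2}\left(\frac{x-b}{\delta}\right)^2 \tag{5.2.168}$$

and near $a$

$$U(x) \simeq U(a) + \frac{1}{2}\left(\frac{x-a}{\alpha}\right)^2. \tag{5.2.169}$$

The constant factor in (5.2.166) is evaluated as

$$\int_{-\infty}^{b} dz\,\exp[-U(z)/D] \simeq \int_{-\infty}^{\infty} dz\,\exp\left[-\frac{U(a)}{D} - \frac{(z-a)^2}{2D\alpha^2}\right] \tag{5.2.170}$$
$$= \alpha\sqrt{2\pi D}\,\exp[-U(a)/D] \tag{5.2.171}$$

and the inner factor becomes, on assuming $x_0$ is well to the right of the central point
$b$,

$$\int_{a}^{x_0} dy\,\exp[U(y)/D] \simeq \int_{-\infty}^{\infty} dy\,\exp\left[\frac{U(b)}{D} - \frac{(y-b)^2}{2D\delta^2}\right] \tag{5.2.172}$$
$$= \delta\sqrt{2\pi D}\,\exp[U(b)/D]. \tag{5.2.173}$$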


Putting both of these in (5.2.166), we get 
$$T(a \to x_0) \simeq 2\pi\alpha\delta\,\exp\{[U(b) - U(a)]/D\}. \tag{5.2.174}$$


This is the classical Arrhenius formula of chemical reaction theory. In a chemical 
reaction, we can model the reaction by introducing a coordinate such that x = a 
is species A and x = c is species C. The reaction is modelled by the above diffusion 
process and the two distinct chemical species are separated by the potential barrier 
at b. In the chemical reaction, statistical mechanics gives the value 


$$D = kT, \tag{5.2.175}$$

where $k$ is Boltzmann's constant and $T$ is the absolute temperature. We see that the
most important dependence on temperature comes from the exponential factor
which is often written

$$\exp(\Delta E/kT) \tag{5.2.176}$$

and predicts a very characteristic dependence on temperature. Intuitively, the
answer is obvious. The exponential factor represents the probability that the energy
will exceed that of the barrier when the system is in thermal equilibrium. Those
molecules that reach this energy then react, with a certain finite probability.
We will come back to problems like this in great detail in Chap.9. 
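
As a rough numerical illustration of (5.2.174), the following Python sketch compares a direct evaluation of the mean first passage time with the Arrhenius estimate. Everything specific here is an assumption made for illustration and does not come from the text: the quartic potential $U(x) = x^4/4 - x^2/2$ (wells at $a = -1$ and $c = 1$, barrier at $b = 0$, hence $\alpha = 1/\sqrt{2}$ and $\delta = 1$), the value $D = 0.05$, the cutoff $-4$ standing in for $-\infty$, and the function names. The exact expression is assumed to be the double integral $(1/D)\int_a^{x_0} dy\, e^{U(y)/D}\int_{-\infty}^{y} dz\, e^{-U(z)/D}$, which is the quantity that (5.2.166) approximates.

```python
import numpy as np

def U(x):
    """Illustrative double-well potential (an assumption, not from the text)."""
    return 0.25 * x**4 - 0.5 * x**2

def mfpt_double_integral(D, a=-1.0, x0=1.0, lower=-4.0, n=200001):
    """Trapezoid-rule evaluation of (1/D) int_a^x0 dy e^{U/D} int_lower^y dz e^{-U/D}."""
    z = np.linspace(lower, x0, n)
    dz = z[1] - z[0]
    w = np.exp(-U(z) / D)
    # inner[i] approximates the integral of exp(-U/D) from `lower` up to z[i]
    inner = np.concatenate(([0.0], np.cumsum(0.5 * (w[1:] + w[:-1]) * dz)))
    keep = z >= a                      # the outer integral starts at y = a
    integrand = np.exp(U(z[keep]) / D) * inner[keep]
    return np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dz / D

def mfpt_arrhenius(D, a=-1.0, b=0.0):
    """Estimate (5.2.174) for the illustrative potential above."""
    alpha = 1.0 / np.sqrt(2.0)         # U''(a) = 2 = 1/alpha^2
    delta = 1.0                        # U''(b) = -1 = -1/delta^2
    return 2.0 * np.pi * alpha * delta * np.exp((U(b) - U(a)) / D)

if __name__ == "__main__":
    D = 0.05
    print(mfpt_double_integral(D), mfpt_arrhenius(D))
```

The two numbers should agree to within corrections of relative order $D$, which come from the anharmonic terms neglected in (5.2.168, 169); reducing $D$ in the sketch is expected to tighten the agreement.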


5.2.8 Probability of Exit Through a Particular End of the Interval 


What is the probability that the particle, initially at $x$ in $(a, b)$, exits through $a$,
and what is the mean exit time?

The total probability that the particle exits through $a$ after time $t$ is given by
the time integral of the probability current at $a$. We thus define this probability by


$$g_a(x, t) = -\int_t^{\infty} dt'\,J(a, t'|x, 0) \tag{5.2.177}$$
$$= \int_t^{\infty} dt'\,\{-A(a)p(a, t'|x, 0) + \tfrac{1}{2}\partial_a[B(a)p(a, t'|x, 0)]\} \tag{5.2.178}$$

(the negative sign is chosen since we need the current pointing to the left) and

$$g_b(x, t) = \int_t^{\infty} dt'\,\{A(b)p(b, t'|x, 0) - \tfrac{1}{2}\partial_b[B(b)p(b, t'|x, 0)]\}. \tag{5.2.179}$$


These quantities give the probabilities that the particle exits through $a$ or $b$ after
time $t$, respectively. The probability that (given that it exits through $a$) it exits after
time $t$ is

$$\text{Prob}(T_a > t) = g_a(x, t)/g_a(x, 0). \tag{5.2.180}$$


We now find an equation for $g_a(x, t)$. We use the fact that $p(a, t|x, 0)$ satisfies a
backward Fokker-Planck equation. Thus,

$$A(x)\partial_x g_a(x, t) + \tfrac{1}{2}B(x)\partial_x^2 g_a(x, t) = -\int_t^{\infty} dt'\,\partial_{t'}J(a, t'|x, 0) = J(a, t|x, 0) = \partial_t g_a(x, t). \tag{5.2.181}$$


The mean exit time, given that exit is through a, is 


$$T(a, x) = -\int_0^{\infty} t\,\partial_t\,\text{Prob}(T_a > t)\,dt = \int_0^{\infty} g_a(x, t)\,dt\,\Big/\,g_a(x, 0). \tag{5.2.182}$$


Simply integrating (5.2.181) with respect to $t$, we get

$$A(x)\partial_x[\pi_a(x)T(a, x)] + \tfrac{1}{2}B(x)\partial_x^2[\pi_a(x)T(a, x)] = -\pi_a(x), \tag{5.2.183}$$

where we define

$$\pi_a(x) = \text{(probability of exit through } a\text{)} = g_a(x, 0). \tag{5.2.184}$$

The boundary conditions on (5.2.183) are quite straightforward since they follow
from those for the backward Fokker-Planck equation, namely,

$$\pi_a(a)T(a, a) = \pi_a(b)T(a, b) = 0. \tag{5.2.185}$$

In the first of these clearly $T(a, a)$ is zero (the time to reach $a$ from $a$ is zero) and in
the second, $\pi_a(b)$ is zero (the probability of exiting through $a$, starting from $b$, is
zero).

By letting $t \to 0$ in (5.2.181), we see that $J(a, 0|x, 0)$ must vanish if $a \neq x$, since
$p(a, 0|x, 0) = \delta(x - a)$. Hence, the right-hand side tends to zero and we get


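$$A(x)\partial_x\pi_a(x) + \tfrac{1}{2}B(x)\partial_x^2\pi_a(x) = 0, \tag{5.2.186}$$

the boundary condition this time being

$$\pi_a(a) = 1, \qquad \pi_a(b) = 0. \tag{5.2.187}$$

The solution of (5.2.186) subject to this boundary condition and the condition

$$\pi_a(x) + \pi_b(x) = 1 \tag{5.2.188}$$

is

$$\pi_a(x) = \left[\int_x^b dy\,\psi(y)\right]\Big/\int_a^b dy\,\psi(y) \tag{5.2.189}$$
$$\pi_b(x) = \left[\int_a^x dy\,\psi(y)\right]\Big/\int_a^b dy\,\psi(y) \tag{5.2.190}$$

with $\psi(x)$ as defined in (5.2.157).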
These formulae find application in the problem of relaxation of a distribution 
initially concentrated at an unstable stationary point (Sect.9.1.4). 
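
The boundary value problem (5.2.186) can also be solved numerically. The sketch below discretises $A(x)\partial_x\pi_a + \tfrac{1}{2}B(x)\partial_x^2\pi_a = 0$ with central differences, imposing $\pi_a(a) = 1$ and $\pi_a(b) = 0$ (exit through $a$ is certain when starting at $a$ and impossible when starting at $b$). It is only an illustration: the function names, the grid size, and the constant-coefficient test case $A = B = 1$ on $(0, 1)$ are assumptions, and the comparison formula is obtained by solving the constant-coefficient equation directly.

```python
import numpy as np

def exit_probability_through_a(A, B, a, b, n=401):
    """Finite-difference solution of A pi' + (1/2) B pi'' = 0, pi(a) = 1, pi(b) = 0."""
    x = np.linspace(a, b, n)
    h = x[1] - x[0]
    M = np.zeros((n, n))
    rhs = np.zeros(n)
    M[0, 0] = 1.0          # boundary condition pi(a) = 1
    rhs[0] = 1.0
    M[-1, -1] = 1.0        # boundary condition pi(b) = 0
    for i in range(1, n - 1):
        drift, diff = A(x[i]), B(x[i])
        M[i, i - 1] = 0.5 * diff / h**2 - drift / (2.0 * h)
        M[i, i] = -diff / h**2
        M[i, i + 1] = 0.5 * diff / h**2 + drift / (2.0 * h)
    return x, np.linalg.solve(M, rhs)

def exact_constant_coefficients(x, A0, B0, a, b):
    """Closed form for constant drift A0 and diffusion B0 (general solution C1 + C2 exp(-2 A0 x / B0))."""
    k = 2.0 * A0 / B0
    return (np.exp(-k * x) - np.exp(-k * b)) / (np.exp(-k * a) - np.exp(-k * b))

if __name__ == "__main__":
    x, pi_a = exit_probability_through_a(lambda s: 1.0, lambda s: 1.0, 0.0, 1.0)
    print(np.max(np.abs(pi_a - exact_constant_coefficients(x, 1.0, 1.0, 0.0, 1.0))))
```

With a drift pointing towards $b$, the computed $\pi_a(x)$ falls below the pure-diffusion value $(b - x)/(b - a)$, as one would expect.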


5.3 Fokker-Planck Equations in Several Dimensions


In many-variable situations, Fokker-Planck equations take on an essentially more
complex range of behaviour than is possible in the case of one variable. Boundaries
are no longer simple end points of a line but rather curves or surfaces, and the
nature of the boundary can change from place to place. Stationary solutions even
with reflecting boundaries can correspond to nonzero probability currents and
eigenfunction methods are no longer so simple.
Nevertheless, the analogies between one and many dimensions are useful, and
this section will follow the same general outline as that on one-variable situations.


5.3.1 Change of Variables 


Suppose we have a Fokker-Planck equation in variables $x_i$,

$$\partial_t p(x, t) = -\sum_i \partial_i[A_i(x)p(x, t)] + \tfrac{1}{2}\sum_{i,j}\partial_i\partial_j[B_{ij}(x)p(x, t)] \tag{5.3.1}$$

and we want to know the corresponding equation for the variables

$$y_i = f_i(x), \tag{5.3.2}$$

where $f_i$ are certain differentiable independent functions. Let us denote by $\tilde{p}(y, t)$
the probability density for the new variable, which is given by

$$\tilde{p}(y, t) = p(x, t)\left|\frac{\partial(x_1, x_2, \ldots)}{\partial(y_1, y_2, \ldots)}\right|. \tag{5.3.3}$$


The simplest way to effect the change of variables is to use Ito’s formula on the 
corresponding SDE 


$$dx(t) = A(x)\,dt + \sqrt{B(x)}\,dW(t) \tag{5.3.4}$$

and then recompute the corresponding FPE for $\tilde{p}(y, t)$ from the resulting SDE as
derived in Sect. 4.3.4.

The result is rather complicated. In specific situations, direct implementation of
(5.3.3) may be preferable. There is no way of avoiding a rather messy calculation
unless full use of symmetries and simplifications is made.


Example: Cartesian to Polar Coordinates. As an example, one can consider the 
transformation to polar coordinates of the Rayleigh process, previously done by 
the SDE method in Sect.4.4.5. Thus, the Fokker-Planck equation is 


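$$\partial_t p(E_1, E_2, t) = \gamma\frac{\partial}{\partial E_1}(E_1 p) + \gamma\frac{\partial}{\partial E_2}(E_2 p) + \tfrac{1}{2}\varepsilon^2\left(\frac{\partial^2 p}{\partial E_1^2} + \frac{\partial^2 p}{\partial E_2^2}\right) \tag{5.3.5}$$

and we want to find the FPE for $a$ and $\phi$ defined by

$$E_1 = a\cos\phi, \qquad E_2 = a\sin\phi. \tag{5.3.6}$$

The Jacobian is

$$|J| = \left|\frac{\partial(E_1, E_2)}{\partial(a, \phi)}\right| = \begin{vmatrix}\cos\phi & -a\sin\phi \\ \sin\phi & a\cos\phi\end{vmatrix} = a. \tag{5.3.7}$$

We use the polar form of the Laplacian to write

$$\frac{\partial^2}{\partial E_1^2} + \frac{\partial^2}{\partial E_2^2} = \frac{1}{a^2}\frac{\partial^2}{\partial\phi^2} + \frac{1}{a}\frac{\partial}{\partial a}\left(a\frac{\partial}{\partial a}\right) \tag{5.3.8}$$

and inverting (5.3.6)

$$a = \sqrt{E_1^2 + E_2^2}, \qquad \phi = \tan^{-1}(E_2/E_1), \tag{5.3.9}$$

we note

$$\frac{\partial a}{\partial E_1} = \frac{E_1}{\sqrt{E_1^2 + E_2^2}} = \cos\phi. \tag{5.3.10}$$

Similarly,

$$\frac{\partial a}{\partial E_2} = \sin\phi$$

and

$$\frac{\partial\phi}{\partial E_2} = \frac{E_1}{E_1^2 + E_2^2} = \cos\phi/a. \tag{5.3.11}$$

Similarly,

$$\frac{\partial\phi}{\partial E_1} = -\sin\phi/a.$$

Hence,

$$\frac{\partial}{\partial E_1}(E_1 p) + \frac{\partial}{\partial E_2}(E_2 p) = 2p + E_1\left(\frac{\partial p}{\partial a}\frac{\partial a}{\partial E_1} + \frac{\partial p}{\partial\phi}\frac{\partial\phi}{\partial E_1}\right) + E_2\left(\frac{\partial p}{\partial a}\frac{\partial a}{\partial E_2} + \frac{\partial p}{\partial\phi}\frac{\partial\phi}{\partial E_2}\right)$$
$$= 2p + a\frac{\partial p}{\partial a} = \frac{1}{a}\frac{\partial}{\partial a}(a^2 p). \tag{5.3.12}$$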


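Let us use the symbol $\tilde{p}(a, \phi)$ for the density function in terms of $a$ and $\phi$. The Jacobian
formula (5.3.3) tells us that

$$\tilde{p}(a, \phi) = \left|\frac{\partial(E_1, E_2)}{\partial(a, \phi)}\right|p(E_1, E_2) = a\,p(E_1, E_2). \tag{5.3.13}$$

Putting together (5.3.5, 8, 12, 13), we get

$$\frac{\partial\tilde{p}}{\partial t} = -\frac{\partial}{\partial a}\left[\left(-\gamma a + \frac{\varepsilon^2}{2a}\right)\tilde{p}\right] + \frac{\varepsilon^2}{2}\left(\frac{1}{a^2}\frac{\partial^2\tilde{p}}{\partial\phi^2} + \frac{\partial^2\tilde{p}}{\partial a^2}\right) \tag{5.3.14}$$

which (of course) is the FPE, corresponding to the two SDE's in Sect. 4.4.5, which
were derived by changing variables according to Ito's formula.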


5.3.2 Boundary Conditions 


We have already touched on boundary conditions in general in Sect.5.2.1 where 
they were considered in terms of probability current. The full range of boundary 
conditions for an arbitrary multidimensional Fokker-Planck equation does not 
seem to have been specified yet. In this book we shall therefore consider mostly 
reflecting barrier boundary conditions at a surface S, namely, 


$$n\cdot J = 0 \quad\text{on } S, \tag{5.3.15}$$

where $n$ is the normal to the surface and

$$J_i(x, t) = A_i(x, t)p(x, t) - \tfrac{1}{2}\sum_j\frac{\partial}{\partial x_j}B_{ij}(x, t)p(x, t) \tag{5.3.16}$$

and absorbing barrier boundary conditions

$$p(x, t) = 0 \quad\text{for } x \text{ on } S. \tag{5.3.17}$$

In practice, some part of the surface may be reflecting and another absorbing.
At a surface $S$ on which $A_i$ or $B_{ij}$ are discontinuous, we enforce

$$n\cdot J_1 = n\cdot J_2 \quad\text{on } S, \qquad p_1(x) = p_2(x) \quad x \text{ on } S. \tag{5.3.18}$$


The tangential current component is permitted to be discontinuous. 
The boundary conditions on the backward equation have already been derived 
in Sect.5.2.4. For completeness, they are 


$$\text{Absorbing Boundary}\qquad p(x, t|y, t') = 0 \qquad y \in S \tag{5.3.19}$$
$$\text{Reflecting Boundary}\qquad \sum_{i,j} n_i B_{ij}(y)\frac{\partial}{\partial y_j}p(x, t|y, t') = 0 \qquad y \in S. \tag{5.3.20}$$


5.3.3 Stationary Solutions: Potential Conditions 


A large class of interesting systems is described by Fokker-Planck equations which 
permit a stationary distribution for which the probability current vanishes for 
all x in R. Assuming this to be the case, by rearranging the definition of J (5.3.16), 
we obtain a completely equivalent equation 


$$\tfrac{1}{2}\sum_j B_{ij}(x)\frac{\partial p_s(x)}{\partial x_j} = p_s(x)\left[A_i(x) - \tfrac{1}{2}\sum_j\frac{\partial}{\partial x_j}B_{ij}(x)\right]. \tag{5.3.21}$$

If the matrix $B_{ij}(x)$ has an inverse for all $x$, we can rewrite (5.3.21)

$$\frac{\partial}{\partial x_i}\log[p_s(x)] = \sum_k B^{-1}_{ik}(x)\left[2A_k(x) - \sum_j\frac{\partial}{\partial x_j}B_{kj}(x)\right] \tag{5.3.22}$$
$$\equiv Z_i[A, B, x]. \tag{5.3.23}$$

This equation cannot be satisfied for arbitrary $B_{ij}(x)$ and $A_i(x)$ since the left-hand
side is explicitly a gradient. Hence, $Z_i$ must also be a gradient, and a necessary and
sufficient condition for that is the vanishing of the curl, i.e.,

$$\frac{\partial Z_i}{\partial x_j} = \frac{\partial Z_j}{\partial x_i}. \tag{5.3.24}$$

If this condition is satisfied, the stationary solution can be obtained by simple
integration of (5.3.22):

$$p_s(x) = \exp\left\{\int^x dx'\cdot Z[A, B, x']\right\}. \tag{5.3.25}$$


The conditions (5.3.24) are known as potential conditions since we derive the quantities
$Z_i$ from derivatives of $\log[p_s(x)]$, which, therefore, is often thought of as a
potential $-\phi(x)$ so that more precisely,

$$p_s(x) = \exp[-\phi(x)] \tag{5.3.26}$$

and

$$\phi(x) = -\int^x dx'\cdot Z[A, B, x']. \tag{5.3.27}$$


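Example: Rayleigh Process in Polar Coordinates. From (5.3.14) we find

$$A = \begin{bmatrix}-\gamma a + \varepsilon^2/2a \\ 0\end{bmatrix} \tag{5.3.28}$$
$$B = \begin{bmatrix}\varepsilon^2 & 0 \\ 0 & \varepsilon^2/a^2\end{bmatrix} \tag{5.3.29}$$

from which

$$\sum_j\frac{\partial}{\partial x_j}B_{a,j} = \sum_j\frac{\partial}{\partial x_j}B_{\phi,j} = 0 \tag{5.3.30}$$

so that

$$Z = 2B^{-1}A = \begin{bmatrix}-2\gamma a/\varepsilon^2 + 1/a \\ 0\end{bmatrix} \tag{5.3.31}$$

and clearly

$$\frac{\partial Z_a}{\partial\phi} = \frac{\partial Z_\phi}{\partial a} = 0. \tag{5.3.32}$$

The stationary solution is then

$$p_s(a, \phi) = \exp\left[\int^{(a,\phi)}(d\phi\,Z_\phi + da\,Z_a)\right] \tag{5.3.33}$$
$$= \mathcal{N}\exp\left(-\frac{\gamma a^2}{\varepsilon^2} + \log a\right) \tag{5.3.34}$$
$$= \mathcal{N}a\exp\left(-\frac{\gamma a^2}{\varepsilon^2}\right). \tag{5.3.35}$$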
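
As a quick consistency check (a worked step added here for illustration, not part of the original derivation), one can substitute the stationary solution $p_s = \mathcal{N}a\exp(-\gamma a^2/\varepsilon^2)$ of (5.3.35) into the definition of the current (5.3.16), with the drift and diffusion of (5.3.28, 29), and confirm that both components vanish:

$$\frac{\partial p_s}{\partial a} = \left(\frac{1}{a} - \frac{2\gamma a}{\varepsilon^2}\right)p_s,$$
$$J_a = \left(-\gamma a + \frac{\varepsilon^2}{2a}\right)p_s - \frac{\varepsilon^2}{2}\frac{\partial p_s}{\partial a} = 0, \qquad J_\phi = -\frac{1}{2}\frac{\partial}{\partial\phi}\left(\frac{\varepsilon^2}{a^2}p_s\right) = 0.$$

The second result holds because $p_s$ does not depend on $\phi$.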


5.3.4 Detailed Balance 


a) Definition of Detailed Balance 

The fact that the stationary solution of certain Fokker-Planck equations corres- 
ponds to a vanishing probability current is a particular version of the physical 
phenomenon of detailed balance. A Markov process satisfies detailed balance if, 
roughly speaking, in the stationary situation each possible transition balances 
with the reversed transition. The concept of detailed balance comes from physics, 
so let us explain more precisely with a physical example. We consider a gas of
particles with positions $r$ and velocities $v$. Then a transition corresponds to a
particle at some time $t$ with position and velocity $(r, v)$ having acquired by a later
time $t + \tau$ position and velocity $(r', v')$. The probability density of this transition
is the joint probability density $p(r', v', t + \tau; r, v, t)$.

We may symbolically write this transition as 


$$(r, v, t) \to (r', v', t + \tau). \tag{5.3.36}$$

The reversed transition is not given simply by interchanging primed and unprimed
quantities. Rather, it is

$$(r', -v', t) \to (r, -v, t + \tau). \tag{5.3.37}$$

It corresponds to the time reversed transition and requires the velocities to be reversed
because the motion from $r'$ to $r$ is in the opposite direction from that from
$r$ to $r'$.

The probability density for the reversed transition is thus the joint probability
density

$$p(r, -v, t + \tau; r', -v', t). \tag{5.3.38}$$

The principle of detailed balance requires the equality of these two joint probabilities
when the system is in a stationary state. Thus, we may write

$$p_s(r', v', \tau; r, v, 0) = p_s(r, -v, \tau; r', -v', 0). \tag{5.3.39}$$


(The principle can be derived under certain conditions from the laws of physics, 
see [5.7] and Sect.5.3.6b.) 
More explicitly, for a Markov process we can rewrite (5.3.39)

$$p(r', v', \tau|r, v, 0)p_s(r, v) = p(r, -v, \tau|r', -v', 0)p_s(r', -v'), \tag{5.3.40}$$

where the conditional probabilities now apply to the corresponding homogeneous
Markov process (if the process was not Markov, the conditional probabilities would
be for the stationary system only).

In its general form, detailed balance is formulated in terms of arbitrary variables
$x_i$, which under time reversal, transform to the reversed variables according to the rule

$$x_i \to \varepsilon_i x_i \tag{5.3.41}$$
$$\varepsilon_i = \pm 1 \tag{5.3.42}$$


depending on whether the variable is odd or even under time reversal. In the above, 
r is even, v is odd. 
Then by detailed balance we require 


$$p_s(x, t + \tau; x', t) = p_s(\varepsilon x', t + \tau; \varepsilon x, t). \tag{5.3.43}$$

By $\varepsilon x$, we mean $(\varepsilon_1 x_1, \varepsilon_2 x_2, \ldots)$.
Notice that setting $\tau = 0$ in (5.3.43) we obtain

$$\delta(x - x')p_s(x') = \delta(\varepsilon x - \varepsilon x')p_s(\varepsilon x). \tag{5.3.44}$$

The two delta functions are equal since only sign changes are involved. Hence,

$$p_s(x) = p_s(\varepsilon x) \tag{5.3.45}$$

is a consequence of the formulation of detailed balance by (5.3.43). Rewriting now
in terms of conditional probabilities, we have

$$p(x, \tau|x', 0)p_s(x') = p(\varepsilon x', \tau|\varepsilon x, 0)p_s(x). \tag{5.3.46}$$


b) General Consequences of Detailed Balance 
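An important consequence of (5.3.45) is that

$$\langle x\rangle_s = \varepsilon\langle x\rangle_s \tag{5.3.47}$$

(hence all odd variables have zero stationary mean), and for the autocorrelation
function

$$G(\tau) \equiv \langle x(\tau)x^T(0)\rangle_s$$

we have

$$G(\tau) = \varepsilon\langle x(0)x^T(\tau)\rangle_s\,\varepsilon,$$

hence,

$$G(\tau) = \varepsilon G^T(\tau)\varepsilon \tag{5.3.48}$$

and setting $\tau = 0$ and noting that the covariance matrix $\sigma$ satisfies $\sigma = \sigma^T$,

$$\sigma\varepsilon = \varepsilon\sigma. \tag{5.3.49}$$

For the spectrum matrix

$$S(\omega) \equiv \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-i\omega\tau}G(\tau)\,d\tau$$

we find from (5.3.48) that

$$S(\omega) = \varepsilon S^T(\omega)\varepsilon. \tag{5.3.50}$$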


c) Situations in Which Detailed Balance must be Generalised 

It is possible that there exist several stationary solutions to a Markov process, 
and in this situation, a weaker form of detailed balance may hold, namely, instead
of (5.3.43), we have

$$p_s^1(x, t + \tau; x', t) = p_s^2(\varepsilon x', t + \tau; \varepsilon x, t) \tag{5.3.51}$$


where the superscripts 1 and 2 refer to two different stationary solutions. Such a 
situation can exist if one of the variables is odd under time reversal, but does not 
change with time; for example, in a centrifuge the total angular momentum has this 
property. A constant magnetic field acts the same way. 

Mostly, one writes the detailed balance conditions in such situations as 


$$p_s^{\lambda}(x, t + \tau; x', t) = p_s^{\varepsilon\lambda}(\varepsilon x', t + \tau; \varepsilon x, t) \tag{5.3.52}$$

where $\lambda$ is a vector of such constant quantities, which change to $\varepsilon\lambda$ under time reversal.
According to one point of view, such a situation does not represent detailed
balance, since in a given stationary situation, the transitions do not balance in
detail. It is perhaps better to call the property (5.3.52) time reversal invariance.
In the remainder of our considerations, we shall mean by detailed balance the 
situation (5.3.45), since no strong consequences arise from the form (5.3.52). 


5.3.5 Consequences of Detailed Balance 


The formulation of detailed balance for the Fokker-Planck equation was done
by van Kampen [5.7] and independently by Uhlhorn [5.8], and Graham and Haken
[5.9]. We will formulate the conditions in a slightly more direct and more general
way. We want necessary and sufficient conditions on the drift and diffusion coefficients
and the jump probabilities for a homogeneous Markov process to have
stationary solutions which satisfy detailed balance. We shall show that necessary
and sufficient conditions are given by

$$\begin{aligned}
\text{(i)}\quad & W(x|x')p_s(x') = W(\varepsilon x'|\varepsilon x)p_s(x) \\
\text{(ii)}\quad & \varepsilon_i A_i(\varepsilon x)p_s(x) = -A_i(x)p_s(x) + \sum_j\frac{\partial}{\partial x_j}[B_{ij}(x)p_s(x)] \\
\text{(iii)}\quad & \varepsilon_i\varepsilon_j B_{ij}(\varepsilon x) = B_{ij}(x).
\end{aligned} \tag{5.3.53}$$


The specialisation to a FPE is simply done by setting the jump probabilities W(x | x’) 
equal to zero. 


Necessary Conditions. It is simpler to formulate conditions for the differential
Chapman-Kolmogorov equation than to restrict ourselves to the Fokker-Planck
equation. According to Sect. 3.4 which defines the quantities $W(x|x')$, $A_i(x)$ and
$B_{ij}(x)$ (all of course being time independent, since we are considering a homogeneous
process), we have the trivial result that detailed balance requires, from (5.3.46)

$$W(x|x')p_s(x') = W(\varepsilon x'|\varepsilon x)p_s(x). \tag{5.3.54}$$

Consider now the drift coefficient. For simplicity write

$$x' = x + \delta. \tag{5.3.55}$$


Then from (5.3.46) we have 


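$$\int_{|\delta|<K} d\delta\,\delta_i\,p(\varepsilon x + \varepsilon\delta, \Delta t|\varepsilon x, 0)p_s(x) = \int_{|\delta|<K} d\delta\,\delta_i\,p(x, \Delta t|x + \delta, 0)p_s(x + \delta) \tag{5.3.56}$$

(we use $K$ instead of $\varepsilon$ in the range of integration to avoid confusion with $\varepsilon_i$); divide
by $\Delta t$ and take the limit $\Delta t \to 0$, and the left-hand side yields

$$\varepsilon_i A_i(\varepsilon x)p_s(x) + O(K). \tag{5.3.57}$$

On the right-hand side we write

$$p(x + \delta - \delta, \Delta t|x + \delta, 0)p_s(x + \delta) = p(x - \delta, \Delta t|x, 0)p_s(x) + \sum_j\delta_j\frac{\partial}{\partial x_j}[p(x - \delta, \Delta t|x, 0)p_s(x)] + O(\delta^2) \tag{5.3.58}$$

so that the right-hand side is

$$\lim_{\Delta t\to 0}\frac{1}{\Delta t}\int d\delta\left\{\delta_i\,p(x - \delta, \Delta t|x, 0)p_s(x) + \sum_j\delta_i\delta_j\frac{\partial}{\partial x_j}[p(x - \delta, \Delta t|x, 0)p_s(x)]\right\} + O(K)$$
$$= -A_i(x)p_s(x) + \sum_j\frac{\partial}{\partial x_j}[B_{ij}(x)p_s(x)] + O(K), \tag{5.3.59}$$

where we have used the fact demonstrated in Sect. 3.4 that terms involving
higher powers of $\delta$ than $\delta^2$ are of order $K$. Letting $K \to 0$, we find

$$\varepsilon_i A_i(\varepsilon x)p_s(x) = -A_i(x)p_s(x) + \sum_j\frac{\partial}{\partial x_j}[B_{ij}(x)p_s(x)]. \tag{5.3.60}$$

The condition on $B_{ij}(x)$ is obtained similarly, but in this case no term like the second
on the right of (5.3.60) arises, since the principal term is $O(\delta^2)$. We find

$$\varepsilon_i\varepsilon_j B_{ij}(\varepsilon x) = B_{ij}(x). \tag{5.3.61}$$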


A third condition is, of course, that p,(x) be a stationary solution of the differential 
Chapman-Kolmogorov equation. This is not a trivial condition, and is, in general, 
independent of the others. 


Sufficient Conditions. We now show that (5.3.53) are sufficient. Assume that these
conditions are satisfied, that $p_s(x)$ is a stationary solution of the differential
Chapman-Kolmogorov equation, and that $p(x, t|x', 0)$ is a solution of the differential
Chapman-Kolmogorov equation. We now consider a quantity

$$\hat{p}(x, t|x', 0) \equiv p(\varepsilon x', t|\varepsilon x, 0)p_s(x)/p_s(x'). \tag{5.3.62}$$

Clearly

$$\hat{p}(x, 0|x', 0) = \delta(x - x') = p(x, 0|x', 0). \tag{5.3.63}$$

We substitute $\hat{p}$ into the differential Chapman-Kolmogorov equation and show
that because $p(x', t|x, 0)$ obeys the backward differential Chapman-Kolmogorov
equation in the variable $x$, the quantity $\hat{p}$ is a solution of the forward differential
Chapman-Kolmogorov equation.

We do this explicitly. The notation is abbreviated for clarity, so that we write

$$\begin{aligned}
\hat{p} &\text{ for } \hat{p}(x, t|x', 0) \\
p_s &\text{ for } p_s(x) \\
p_s' &\text{ for } p_s(x') \\
p(x) &\text{ for } p(x', t|x, 0).
\end{aligned} \tag{5.3.64}$$


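We proceed term by term.

i) Drift Term:

$$-\sum_i\frac{\partial}{\partial x_i}(A_i\hat{p}) = -\sum_i\frac{\partial}{\partial x_i}(A_i\,p(\varepsilon x)p_s/p_s') = -\sum_i\left\{\frac{\partial}{\partial x_i}[A_i p_s]\,p(\varepsilon x) + A_i p_s\frac{\partial}{\partial x_i}p(\varepsilon x)\right\}\Big/p_s'. \tag{5.3.65}$$

ii) Diffusion Term:

$$\tfrac{1}{2}\sum_{i,j}\frac{\partial^2}{\partial x_i\partial x_j}(B_{ij}\hat{p}) = \tfrac{1}{2}\sum_{i,j}\left\{\frac{\partial^2}{\partial x_i\partial x_j}(B_{ij}p_s)\,p(\varepsilon x) + 2\frac{\partial}{\partial x_i}(B_{ij}p_s)\frac{\partial}{\partial x_j}p(\varepsilon x) + B_{ij}p_s\frac{\partial^2}{\partial x_i\partial x_j}p(\varepsilon x)\right\}\Big/p_s'. \tag{5.3.66}$$

iii) Jump Term:

$$\int dz\,[W(x|z)\hat{p}(z, t|x', 0) - W(z|x)\hat{p}(x, t|x', 0)]$$
$$= \int dz\,[W(x|z)p_s(z)p(\varepsilon x', t|\varepsilon z, 0) - W(z|x)p_s(x)p(\varepsilon x', t|\varepsilon x, 0)]/p_s'. \tag{5.3.67}$$

We now use the fact that $p_s(x)$ is a solution of the stationary differential
Chapman-Kolmogorov equation to write

$$-\sum_i\frac{\partial}{\partial x_i}(A_i p_s) + \tfrac{1}{2}\sum_{i,j}\frac{\partial^2}{\partial x_i\partial x_j}(B_{ij}p_s) - \int dz\,W(z|x)p_s(x) = -\int dz\,W(x|z)p_s(z) \tag{5.3.68}$$

and using the detailed balance condition (5.3.53(i)) for $W$

$$= -\int dz\,W(\varepsilon z|\varepsilon x)p_s(x). \tag{5.3.69}$$

Now substitute

$$y = \varepsilon x, \qquad y' = \varepsilon x' \tag{5.3.70}$$

and add up all three contributions, taking care of (5.3.68, 69):

$$= \left\{-\sum_i\varepsilon_i A_i(\varepsilon y)p_s(y)\frac{\partial}{\partial y_i}p(y', t|y, 0) + \sum_{i,j}\varepsilon_i\varepsilon_j\frac{\partial}{\partial y_j}[B_{ij}(\varepsilon y)p_s(y)]\frac{\partial}{\partial y_i}p(y', t|y, 0)\right.$$
$$\left. + \tfrac{1}{2}\sum_{i,j}\varepsilon_i\varepsilon_j B_{ij}(\varepsilon y)p_s(y)\frac{\partial^2}{\partial y_i\partial y_j}p(y', t|y, 0)\right\}\Big/p_s(y')$$
$$+ \int dz\,[W(\varepsilon y|z)p_s(z)p(y', t|\varepsilon z, 0) - W(\varepsilon y|z)p_s(z)p(y', t|y, 0)]/p_s(y'). \tag{5.3.71}$$

We now substitute the detailed balance conditions (5.3.53).

$$= \left\{\sum_i A_i(y)\frac{\partial}{\partial y_i}p(y', t|y, 0) + \tfrac{1}{2}\sum_{i,j}B_{ij}(y)\frac{\partial^2}{\partial y_i\partial y_j}p(y', t|y, 0)\right.$$
$$\left. + \int dz\,[W(z|y)p(y', t|z, 0) - W(z|y)p(y', t|y, 0)]\right\}p_s(y)/p_s(y'). \tag{5.3.72}$$

The term in the large curly brackets is now recognisable as the backward differential
Chapman-Kolmogorov operator [Sect. 3.6, (3.6.4)]. Note that the process is homogeneous,
so that

$$p(y', t|y, 0) = p(y', 0|y, -t).$$

We see that

$$(5.3.72) = \frac{\partial}{\partial t}[p(y', t|y, 0)p_s(y)/p_s(y')] = \frac{\partial}{\partial t}\hat{p}(x, t|x', 0) \tag{5.3.73}$$

which means that $\hat{p}(x, t|x', 0)$, defined in (5.3.62), satisfies the forward differential
Chapman-Kolmogorov equation. Since the initial conditions of $p(x, t|x', 0)$ and
$\hat{p}(x, t|x', 0)$ at $t = 0$ are the same (5.3.63) and the solutions are unique, we have
shown that provided the detailed balance conditions (5.3.53) are satisfied, detailed
balance is satisfied. Hence, sufficiency is shown.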


Comments 
i) Even variables only: the conditions are considerably simpler if all $\varepsilon_i$ are $+1$. In
this case, the conditions reduce to

$$W(x|x')p_s(x') = W(x'|x)p_s(x) \tag{5.3.74}$$
$$A_i(x)p_s(x) = \tfrac{1}{2}\sum_j\frac{\partial}{\partial x_j}[B_{ij}(x)p_s(x)] \tag{5.3.75}$$
$$B_{ij}(x) = B_{ij}(x), \tag{5.3.76}$$

the last of which is trivial. The condition (5.3.75) is exactly the same as the potential
condition (5.3.21) which expresses the vanishing of $J$, the probability current in the
stationary state.

The conditions (5.3.74, 75) taken together imply that $p_s(x)$ satisfies the stationary
differential Chapman-Kolmogorov equation, which is not the case for the general
conditions (5.3.53).


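ii) Fokker-Planck equations: van Kampen [5.7], and Graham and Haken [5.9] introduced
the concept of reversible and irreversible drift parts. The irreversible
drift is

$$D_i(x) \equiv \tfrac{1}{2}[A_i(x) + \varepsilon_i A_i(\varepsilon x)] \tag{5.3.77}$$

and the reversible drift

$$I_i(x) \equiv \tfrac{1}{2}[A_i(x) - \varepsilon_i A_i(\varepsilon x)]. \tag{5.3.78}$$

Using again the potential defined by

$$p_s(x) = \exp[-\phi(x)], \tag{5.3.79}$$

we see that in the case of a Fokker-Planck equation, we can write the conditions for
detailed balance as

$$\varepsilon_i\varepsilon_j B_{ij}(\varepsilon x) = B_{ij}(x) \tag{5.3.80}$$
$$D_i(x) - \tfrac{1}{2}\sum_j\frac{\partial}{\partial x_j}B_{ij}(x) = -\tfrac{1}{2}\sum_j B_{ij}(x)\frac{\partial\phi(x)}{\partial x_j} \tag{5.3.81}$$
$$\sum_i\left[\frac{\partial}{\partial x_i}I_i(x) - I_i(x)\frac{\partial\phi(x)}{\partial x_i}\right] = 0 \tag{5.3.82}$$

where the last equation is simply the stationary FPE for $p_s(x)$, after substituting
(5.3.53(ii)). As was the case for the potential conditions, it can be seen that (5.3.81)
gives an equation for $\partial\phi/\partial x_j$ which can only be satisfied provided certain conditions
on $D_i(x)$ and $B_{ij}(x)$ are satisfied. If $B_{ij}(x)$ has an inverse, these take the form

$$\frac{\partial\hat{Z}_i}{\partial x_j} = \frac{\partial\hat{Z}_j}{\partial x_i} \tag{5.3.83}$$

where

$$\hat{Z}_i = \sum_k B^{-1}_{ik}(x)\left[2D_k(x) - \sum_j\frac{\partial}{\partial x_j}B_{kj}(x)\right] \tag{5.3.84}$$

and we have

$$p_s(x) = \exp[-\phi(x)] = \exp\left(\int^x dx'\cdot\hat{Z}\right). \tag{5.3.85}$$

Thus, as in the case of a vanishing probability current, $p_s(x)$ can be determined
explicitly as an integral.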
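
The step from the stationary FPE to (5.3.82) can be spelled out as follows. This is a worked sketch added for illustration; it assumes, as in the comment above, that the jump terms are absent and that $A_i = D_i + I_i$ with $D_i$ and $I_i$ the irreversible and reversible drifts of (5.3.77, 78). Condition (5.3.53(ii)) gives $\sum_j \partial_j[B_{ij}p_s] = [A_i(x) + \varepsilon_i A_i(\varepsilon x)]p_s = 2D_i p_s$, so that

$$0 = -\sum_i\frac{\partial}{\partial x_i}[A_i p_s] + \tfrac{1}{2}\sum_{i,j}\frac{\partial^2}{\partial x_i\partial x_j}[B_{ij}p_s] = \sum_i\frac{\partial}{\partial x_i}[(D_i - A_i)p_s] = -\sum_i\frac{\partial}{\partial x_i}[I_i p_s],$$

and with $p_s = \exp(-\phi)$,

$$\sum_i\frac{\partial}{\partial x_i}[I_i e^{-\phi}] = e^{-\phi}\sum_i\left[\frac{\partial I_i}{\partial x_i} - I_i\frac{\partial\phi}{\partial x_i}\right] = 0.$$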


iii) Connection between backward and forward operators of differential
Chapman-Kolmogorov equations is provided by the detailed balance. The proof of sufficient
conditions amounts to showing that if $f(x, t)$ is a solution of the forward differential
Chapman-Kolmogorov equation, then

$$\tilde{f}(x, t) = f(\varepsilon x, -t)/p_s(x) \tag{5.3.86}$$

is a solution of the backward differential Chapman-Kolmogorov equation. This
relationship will be used in Sect. 5.3.7 for the construction of eigenfunctions.


5.3.6 Examples of Detailed Balance in Fokker-Planck Equations 


a) Kramers’ Equation for Brownian Motion [5.10] 

We take the motion of a particle in a fluctuating environment. The motion is in one
dimension and the state of the particle is described by its position $x$ and velocity $v$.
This gives the differential equations

$$\frac{dx}{dt} = v \tag{5.3.87}$$

and

$$m\frac{dv}{dt} = -V'(x) - \beta v + \sqrt{2\beta kT}\,\xi(t) \tag{5.3.88}$$

which are essentially Langevin's equations (1.2.14) in which for brevity, we write

$$6\pi\eta a = \beta$$

and $V(x)$ is a potential whose gradient $V'(x)$ gives rise to a force on the particle.
By making the assumption that the physical fluctuating force $\xi(t)$ is to be interpreted
as

$$\xi(t)\,dt = dW(t) \tag{5.3.89}$$

as explained in Sect. 4.1, we obtain SDE's

$$dx = v\,dt \tag{5.3.90}$$
$$m\,dv = -[V'(x) + \beta v]\,dt + \sqrt{2\beta kT}\,dW(t) \tag{5.3.91}$$

for which the corresponding FPE is

$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}(vp) + \frac{1}{m}\frac{\partial}{\partial v}\{[V'(x) + \beta v]p\} + \frac{\beta kT}{m^2}\frac{\partial^2 p}{\partial v^2}. \tag{5.3.92}$$

The equation can be slightly simplified by introducing new scaled variables

$$y = x\sqrt{m/kT} \tag{5.3.93}$$
$$u = v\sqrt{m/kT} \tag{5.3.94}$$
$$U(y) = V(x)/kT \tag{5.3.95}$$
$$\gamma = \beta/m \tag{5.3.96}$$

so that the FPE takes the form

$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial y}(up) + \frac{\partial}{\partial u}[U'(y)p] + \gamma\frac{\partial}{\partial u}\left[up + \frac{\partial p}{\partial u}\right] \tag{5.3.97}$$

which we shall call Kramers' equation.
Here, $y$ (the position) is an even variable and $u$ (the velocity) an odd variable, as
explained in Sect. 5.3.4. The drift and diffusion can be written

$$A(y, u) = \begin{bmatrix}u \\ -U'(y) - \gamma u\end{bmatrix} \tag{5.3.98}$$
$$B(y, u) = \begin{bmatrix}0 & 0 \\ 0 & 2\gamma\end{bmatrix} \tag{5.3.99}$$

and

$$\varepsilon\begin{bmatrix}y \\ u\end{bmatrix} = \begin{bmatrix}y \\ -u\end{bmatrix}. \tag{5.3.100}$$

We can check the conditions one by one.
The condition (5.3.53(iii)) is trivially satisfied. The condition (5.3.53(ii)) is
somewhat degenerate, since $B$ is not invertible. It can be written

$$\varepsilon A(y, -u)p_s(y, u) = -A(y, u)p_s(y, u) + \begin{bmatrix}0 \\ 2\gamma\,\partial p_s/\partial u\end{bmatrix} \tag{5.3.101}$$

or, more fully

$$\begin{bmatrix}-u \\ U'(y) - \gamma u\end{bmatrix}p_s = \begin{bmatrix}-u \\ U'(y) + \gamma u\end{bmatrix}p_s + \begin{bmatrix}0 \\ 2\gamma\,\partial p_s/\partial u\end{bmatrix}. \tag{5.3.102}$$

The first line is an identity and the second states

$$-u\,p_s(y, u) = \frac{\partial p_s}{\partial u}, \tag{5.3.103}$$

i.e.,

$$p_s(y, u) = \exp(-\tfrac{1}{2}u^2)f(y) \tag{5.3.104}$$

which means that if $p_s(y, u)$ is written in the form (5.3.104), then the detailed
balance conditions are satisfied. One must now check whether (5.3.104) indeed gives
a stationary solution of Kramers' equation (5.3.97) by substitution. The final bracket
vanishes, leaving

$$0 = -u\frac{\partial f}{\partial y} - U'(y)uf \tag{5.3.105}$$

which means

$$f(y) = \mathcal{N}\exp[-U(y)] \tag{5.3.106}$$

and

$$p_s(y, u) = \mathcal{N}\exp[-U(y) - \tfrac{1}{2}u^2]. \tag{5.3.107}$$

In terms of the original $(x, v)$ variables,

$$p_s(x, v) = \mathcal{N}\exp\left[-\frac{V(x)}{kT} - \frac{mv^2}{2kT}\right] \tag{5.3.108}$$

which is the familiar Boltzmann distribution of statistical mechanics. Notice that
the denominators $kT$ arise from the assumed coefficient $\sqrt{2\beta kT}$ of the fluctuating
force in (5.3.88). Thus, we take the macroscopic equations and add a fluctuating
force, whose magnitude is fixed by the requirement that the solution be the Boltzmann
distribution corresponding to the temperature $T$.

But we have also achieved exactly the right distribution function. This means
that the assumption that Brownian motion is described by a Markov process of the
form (5.3.87, 88) must have considerable validity.
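
The claim that the Boltzmann form $\exp[-U(y) - \tfrac{1}{2}u^2]$ is stationary for Kramers' equation (5.3.97) can be spot-checked numerically. The sketch below evaluates the right-hand side $-\partial_y(up) + \partial_u[U'(y)p] + \gamma\,\partial_u[up + \partial_u p]$ by central differences at a few points. The double-well potential, the value of $\gamma$, the step size, and the deliberately wrong comparison density `p_wrong` are all assumptions chosen for illustration.

```python
import math

GAMMA = 0.7  # friction coefficient gamma (arbitrary illustrative value)

def U(y):
    """Illustrative scaled potential (an assumption, not from the text)."""
    return 0.25 * y**4 - 0.5 * y**2

def dU(y):
    return y**3 - y

def p_boltzmann(y, u):
    """Unnormalised candidate stationary density exp[-U(y) - u^2/2]."""
    return math.exp(-U(y) - 0.5 * u * u)

def p_wrong(y, u):
    """A density with the wrong velocity variance, for contrast."""
    return math.exp(-U(y) - u * u)

def kramers_rhs(p, y, u, h=1e-3):
    """Central-difference estimate of the right-hand side of Kramers' equation."""
    dp_dy = (p(y + h, u) - p(y - h, u)) / (2 * h)
    dp_du = (p(y, u + h) - p(y, u - h)) / (2 * h)
    d2p_du2 = (p(y, u + h) - 2 * p(y, u) + p(y, u - h)) / (h * h)
    # -d/dy(u p) + d/du[U'(y) p] + gamma * d/du[u p + dp/du]
    return -u * dp_dy + dU(y) * dp_du + GAMMA * (p(y, u) + u * dp_du + d2p_du2)

if __name__ == "__main__":
    for point in [(0.3, -1.2), (1.1, 0.4), (-0.8, 2.0)]:
        print(point, kramers_rhs(p_boltzmann, *point), kramers_rhs(p_wrong, *point))
```

The residual for the Boltzmann form should be at the level of the discretisation error, while the mismatched density leaves a residual of order one.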


b) Deterministic Motion 
Here we have $B_{ij}(x)$ and $W(x|x')$ equal to zero, so the detailed balance conditions
are simply

$$\varepsilon_i A_i(\varepsilon x) = -A_i(x). \tag{5.3.109}$$

Since we are now dealing with a Liouville equation (Sect. 3.5.3), the motion of a
point whose coordinates are $x$ is described by the ordinary differential equation

$$\frac{d}{dt}x(t) = A[x(t)]. \tag{5.3.110}$$

Suppose a solution of (5.3.110) which passes through the point $y$ at $t = 0$ is

$$q[t, y] \tag{5.3.111}$$

which therefore satisfies

$$q[0, y] = y. \tag{5.3.112}$$

Then the relation (5.3.109) implies that the reversed solution

$$\varepsilon q(-t, \varepsilon y) \tag{5.3.113}$$

is also a solution of (5.3.110), and since

$$\varepsilon q(0, \varepsilon y) = \varepsilon\varepsilon y = y, \tag{5.3.114}$$

i.e., the initial conditions are the same, these solutions must be identical, i.e.,

$$\varepsilon q(-t, \varepsilon y) = q(t, y). \tag{5.3.115}$$

Now the joint probability in the stationary state can be written as

$$p_s(x, t; x', t') = \int dy\,p(x, t; x', t'; y, 0) = \int dy\,\delta[x - q(t, y)]\,\delta[x' - q(t', y)]\,p_s(y) \tag{5.3.116}$$

and

$$p_s(\varepsilon x', -t'; \varepsilon x, -t) = \int dy\,\delta[\varepsilon x - q(-t, y)]\,\delta[\varepsilon x' - q(-t', y)]\,p_s(y). \tag{5.3.117}$$

Change the variables from $y$ to $\varepsilon y$ and note that $p_s(y) = p_s(\varepsilon y)$, and $d\varepsilon y = dy$, so
that

$$(5.3.117) = \int dy\,\delta[x - \varepsilon q(-t, \varepsilon y)]\,\delta[x' - \varepsilon q(-t', \varepsilon y)]\,p_s(y) \tag{5.3.118}$$

and using (5.3.115),

$$= \int dy\,\delta[x - q(t, y)]\,\delta[x' - q(t', y)]\,p_s(y) \tag{5.3.119}$$
$$= p_s(x, t; x', t'). \tag{5.3.120}$$

Using the stationarity property, that p, depends only on the time difference, we 
see that detailed balance is satisfied. 

This direct proof is, of course, unnecessary since the original general proof is 
valid for this deterministic system. Furthermore, any system of deterministic first- 
order differential equations can be transformed into a Liouville equation, so this 
direct proof is in general unnecessary and it is included here merely as a matter of 
interest. 
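
The reversal property (5.3.115) is easy to check numerically for a concrete reversible system. The sketch below uses a pendulum, $\dot{x} = v$, $\dot{v} = -\sin x$, with $x$ even and $v$ odd, and integrates it with a fourth-order Runge-Kutta scheme. The choice of system, integrator, step count, and all function names are assumptions made for illustration.

```python
import math

EPS = (1.0, -1.0)  # position is even, velocity is odd under time reversal

def drift(state):
    """Drift A(x) of the illustrative pendulum: dx/dt = v, dv/dt = -sin(x)."""
    x, v = state
    return (v, -math.sin(x))

def reverse(state):
    """Apply the time-reversal signs, i.e. state -> eps * state."""
    return tuple(e * s for e, s in zip(EPS, state))

def rk4_step(state, h):
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = drift(state)
    k2 = drift(shifted(k1, h / 2))
    k3 = drift(shifted(k2, h / 2))
    k4 = drift(shifted(k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def flow(t, y, n=2000):
    """Approximate q(t, y); a negative t integrates backwards in time."""
    h = t / n
    state = tuple(y)
    for _ in range(n):
        state = rk4_step(state, h)
    return state

if __name__ == "__main__":
    y0 = (0.4, 1.3)
    print(flow(2.0, y0))                      # q(t, y)
    print(reverse(flow(-2.0, reverse(y0))))   # eps q(-t, eps y)
```

The two printed states should coincide up to rounding, because the drift satisfies the reversibility condition (5.3.109).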

However, it is important to give a brief summary of the philosophy behind
this demonstration of detailed balance. In physical systems, which are where detailed
balance is important, we often have an unbelievably large number of variables,
of the order of $10^{20}$ at least. These variables (say, momentum and velocity of the
particles in a gas) are those which occur in the distribution function which obeys a
Liouville equation for they follow deterministic equations of motion, like Newton's
laws of motion.

It can be shown directly that, for appropriate forms of interaction, Newton’s 
laws obey the principle of microscopic reversibility which means that they can be put 
in the form (5.3.110), where A(x) obeys the reversibility condition (5.3.109). 

The macroscopically observable quantities in such a system are functions of
these variables (for example, pressure, temperature, density of particles) and, by 
appropriate changes of variable, can be represented by the first few components of 
the vector x. 

Thus, we assume $x$ can be written

$$x = (a, \hat{x}) \tag{5.3.121}$$

where the vector $a$ represents the macroscopically observable quantities and $\hat{x}$ is all
the others. Then, in practice, we are interested in

$$\tilde{p}(a_1, t_1; a_2, t_2; a_3, t_3; \ldots) = \int\!\!\int\cdots\int d\hat{x}_1\,d\hat{x}_2\ldots\,p(x_1, t_1; x_2, t_2; x_3, t_3; \ldots). \tag{5.3.122}$$

From the microscopic reversibility, it follows from our reasoning above that $p$, and
thus also $\tilde{p}$, both obey the detailed balance conditions but, of course, $\tilde{p}$ does not
obey a Liouville equation. If it turns out or can be proven that $\tilde{p}$ obeys, to some degree
of approximation, a Markov equation of motion, then we must preserve the detailed
balance property, which takes the same form for $\tilde{p}$ as for $p$. In this sense, the condition
(5.3.43) for detailed balance may be said to be derived from microscopic reversibility
of the equations of motion.


c) Ornstein-Uhlenbeck Process: Onsager Relations in Linear Systems 
Most systems in which detailed balance is of interest can be approximated by an
Ornstein-Uhlenbeck process, i.e., we assume

$$A_i(x) = \sum_j A_{ij}x_j \tag{5.3.123}$$
$$B_{ij}(x) = B_{ij}. \tag{5.3.124}$$

The detailed balance conditions are not trivial, however. Namely,

$$\sum_j(\varepsilon_i\varepsilon_j A_{ij} + A_{ij})x_j = \sum_j B_{ij}\frac{\partial}{\partial x_j}\log p_s(x) \tag{5.3.125}$$

and

$$\varepsilon_i\varepsilon_j B_{ij} = B_{ij}. \tag{5.3.126}$$

Equation (5.3.125) has the qualitative implication that $p_s(x)$ is a Gaussian since the
derivative of $\log p_s(x)$ is linear in $x$. Furthermore, since the left-hand side contains
no constant term, this Gaussian must have zero mean, hence, we can write

$$p_s(x) = \mathcal{N}\exp(-\tfrac{1}{2}x^T\sigma^{-1}x). \tag{5.3.127}$$

One can now substitute (5.3.127) in the stationary Fokker-Planck equation and rearrange
to obtain

$$-\sum_i A_{ii} - \tfrac{1}{2}\sum_{i,j}B_{ij}\sigma^{-1}_{ij} + \sum_{k,j}\left(\sum_i\sigma^{-1}_{ki}A_{ij} + \tfrac{1}{2}\sum_{i,l}\sigma^{-1}_{ki}B_{il}\sigma^{-1}_{lj}\right)x_k x_j = 0 \tag{5.3.128}$$

(we have used the symmetry of the matrix $\sigma$). The quadratic term vanishes if
the symmetric part of its coefficient is zero. This condition may be written in
matrix form as

$$\sigma^{-1}A + A^T\sigma^{-1} = -\sigma^{-1}B\sigma^{-1} \tag{5.3.129}$$

or

$$A\sigma + \sigma A^T = -B. \tag{5.3.130}$$

The constant term also vanishes if (5.3.129) is satisfied. Equation (5.3.130) is, of
course, exactly that derived by SDE techniques in Sect. 4.4.6 (4.4.51) with the
substitutions

$$A \to -A, \qquad BB^T \to B. \tag{5.3.131}$$

We can now write the detailed balance conditions in their most elegant form. We
define the matrix $\varepsilon$ by

$$\varepsilon = \text{diag}(\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots) \tag{5.3.132}$$

and clearly

$$\varepsilon^2 = 1. \tag{5.3.133}$$

Then the conditions (5.3.125, 126) become in matrix notation

$$\varepsilon A\varepsilon + A = -B\sigma^{-1} \tag{5.3.134}$$
$$\varepsilon B\varepsilon = B. \tag{5.3.135}$$

The potential condition (5.3.83) is simply equivalent to the symmetry of $\sigma$.
As noted in Sect. 5.3.4 (5.3.49), detailed balance requires

$$\varepsilon\sigma = \sigma\varepsilon. \tag{5.3.136}$$

Bearing this in mind, we take (5.3.130)

$$A\sigma + \sigma A^T = -B$$

and from (5.3.134)

$$\varepsilon A\varepsilon\sigma + A\sigma = -B \tag{5.3.137}$$

which yield

$$\varepsilon A\varepsilon\sigma = \sigma A^T \tag{5.3.138}$$

and with (5.3.136)

$$\varepsilon(A\sigma) = (A\sigma)^T\varepsilon. \tag{5.3.139}$$

These are the celebrated Onsager relations; Onsager [5.11]; Casimir [5.12]. The
derivation closely follows van Kampen's [5.6] work.

The interpretation can be made simpler by introducing the phenomenological
forces defined as the gradient of the potential $\phi = \log[p_s(x)]$:


$$ F(x) = -\nabla\phi(x) = \sigma^{-1}x \tag{5.3.140} $$
(in physics, $\phi/kT$ is the entropy of the system). Because of the linear form of the
$A_i(x)$ [(5.3.123)], the exact equations of motion for $\langle x\rangle$ are


$$ \frac{d}{dt}\langle x\rangle = A\langle x\rangle = A\sigma F(\langle x\rangle). \tag{5.3.141} $$


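Thus, if the fluxes $d\langle x\rangle/dt$ are related linearly to the forces $F(\langle x\rangle)$ by a matrix $L$
defined by


$$ L = A\sigma, \tag{5.3.142} $$
then (5.3.139) says


$$ \varepsilon L \varepsilon = L^T \tag{5.3.143} $$


or


$$ \begin{aligned} L_{ij} &= L_{ji} && (\varepsilon_i \text{ and } \varepsilon_j \text{ of the same sign}) \\ L_{ij} &= -L_{ji} && (\varepsilon_i \text{ and } \varepsilon_j \text{ of different sign}). \end{aligned} \tag{5.3.144} $$
Notice also that
$$ \varepsilon B \varepsilon = B \quad\text{and}\quad \varepsilon\sigma\varepsilon = \sigma \tag{5.3.145} $$
imply that $B_{ij}$ and $\sigma_{ij}$ vanish if $\varepsilon_i$ and $\varepsilon_j$ have opposite signs.
In the special case that all the $\varepsilon_i$ have the same sign, we find that
$$ L = L^T \tag{5.3.146} $$


and noting that, since $\sigma$ is symmetric and positive definite, it has a real square root
$\sigma^{1/2}$, we find that


$$ \hat{A} \equiv \sigma^{-1/2} A\, \sigma^{1/2} \tag{5.3.147} $$


is symmetric, so that $A$ is similar to a symmetric matrix. Hence, all the eigenvalues
of $A$ are real.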
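
The special case of even variables only can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using randomly generated matrices (the values of `sigma` and `B` are hypothetical, chosen only for illustration). With $\varepsilon = 1$, (5.3.134) reduces to $2A = -B\sigma^{-1}$; the sketch builds $A$ this way and then inspects (5.3.130), (5.3.146) and (5.3.147).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

def random_spd(n, rng):
    """Return a random symmetric positive definite n x n matrix."""
    m = rng.normal(size=(n, n))
    return m @ m.T + n * np.eye(n)

sigma = random_spd(n, rng)   # stationary covariance (hypothetical values)
B = random_spd(n, rng)       # diffusion matrix (hypothetical values)

# With all eps_i = 1, condition (5.3.134) reads 2A = -B sigma^{-1}.
A = -0.5 * B @ np.linalg.inv(sigma)

L = A @ sigma                                     # (5.3.142)
lyapunov_residual = A @ sigma + sigma @ A.T + B   # vanishes if (5.3.130) holds

# Symmetric square root of sigma from its eigendecomposition.
w, v = np.linalg.eigh(sigma)
sqrt_sigma = v @ np.diag(np.sqrt(w)) @ v.T
A_hat = np.linalg.inv(sqrt_sigma) @ A @ sqrt_sigma   # (5.3.147)

eigenvalues = np.linalg.eigvals(A)
```

In this construction `L` and `A_hat` come out symmetric and the eigenvalues of `A` are real, as the argument above requires.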


d) Significance of the Onsager Relations: Fluctuation Dissipation Theorem 
The Onsager relations are for a set of macroscopically observable quantities and thus
provide an easily observed consequence of detailed balance, which is itself a
consequence of the reversibility of microscopic equations of motion, as outlined in (b)
above. However, checking their validity in a given situation requires a knowledge of
the covariance matrix $\sigma$.

Fortunately, in such situations, statistical mechanics gives us the form of the
stationary distribution, provided this is thermodynamic equilibrium, in which
detailed balance is always satisfied. The principle is similar to that used by Langevin
(Sect. 1.2.2). We shall illustrate with an example: the derivation of Nyquist's
formula.

We assume an electric circuit as in Fig. 5.4 in which there is assumed to be a
fluctuating voltage arising from the system having a nonzero temperature, and a
fluctuating charge, which arises because the system is attached to a large neutral
charge reservoir, e.g., an ionic solution. The electrical equations arise from
conservation of an electric charge $q$ and Kirchhoff's voltage law. The charge equation
takes the form


$$ \frac{dq}{dt} = i - \gamma q + \Delta q(t) \tag{5.3.148} $$


in which we equate the rate of gain of charge on the capacitor to the current $i$, less a
leakage term $\gamma q$ into the reservoir, plus a fluctuating term $\Delta q(t)$, which arises from
the reservoir, and whose magnitude will be shortly calculated.


Fig. 5.4. Electric circuit used in the derivation of
Nyquist's formula


Kirchhoff's voltage law is obtained by adding up all the voltages around the
circuit, including a possible fluctuating voltage $\Delta V(t)$:


$$ \frac{di}{dt} = \frac{1}{L}\left[-\frac{q}{C} - iR + \Delta V(t)\right]. \tag{5.3.149} $$


We now assume that $\Delta q(t)$ and $\Delta V(t)$ are white noise. We can write in the most
general case,


$$ \Delta q(t) = b_{11}\xi_1(t) + b_{12}\xi_2(t) \tag{5.3.150} $$


$$ \frac{1}{L}\Delta V(t) = b_{21}\xi_1(t) + b_{22}\xi_2(t) \tag{5.3.151} $$


in which $\xi_1(t)$ and $\xi_2(t)$ are uncorrelated Langevin sources, i.e.,
$$ \xi_1(t)\,dt = dW_1(t) \tag{5.3.152} $$
$$ \xi_2(t)\,dt = dW_2(t), \tag{5.3.153} $$


where $W_1(t)$ and $W_2(t)$ are independent Wiener processes.
Thus, here
$$ A = -\begin{pmatrix} \gamma & -1 \\ \dfrac{1}{LC} & \dfrac{R}{L} \end{pmatrix}. \tag{5.3.154} $$


The total energy in the system is


$$ E = \frac{L}{2}\,i^2 + \frac{1}{2C}\,q^2. \tag{5.3.155} $$


From statistical mechanics, we know that $p_s(q, i)$ is given by the Boltzmann
distribution at temperature $T$, i.e.,


$$ p_s(q, i) = \mathcal{N}\exp(-E/kT) = \mathcal{N}\exp\left[-\frac{L i^2}{2kT} - \frac{q^2}{2CkT}\right], \tag{5.3.156} $$
where $k$ is the Boltzmann constant, so that the covariance matrix is
$$ \sigma = \begin{pmatrix} kTC & 0 \\ 0 & kT/L \end{pmatrix}. \tag{5.3.157} $$
The Onsager symmetry can now be checked:
$$ A\sigma = -\begin{pmatrix} \gamma kTC & -kT/L \\ kT/L & RkT/L^2 \end{pmatrix}. \tag{5.3.158} $$


For this system $q$, the total charge, is even under time inversion and $i$, the current, is
odd. Thus,


$$ \varepsilon = \mathrm{diag}(1, -1) \tag{5.3.159} $$


and it is clear that


$$ (A\sigma)_{12} = -(A\sigma)_{21} \tag{5.3.160} $$
is the only consequence of the symmetry and is satisfied by (5.3.158).

Here, also,

$$ B = -(A\sigma + \sigma A^T) = 2kT\begin{pmatrix} \gamma C & 0 \\ 0 & R/L^2 \end{pmatrix} \tag{5.3.161} $$

so that

$$ b_{12} = b_{21} = 0 \tag{5.3.162} $$

$$ b_{11} = \sqrt{2kT\gamma C}, \qquad b_{22} = \sqrt{2kTR}/L \tag{5.3.163} $$
and we see that $B_{12} = B_{21} = 0$, as required by physical intuition, which suggests
that the two sources of fluctuations arise from different causes and should be
independent.
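
The matrix algebra of this circuit example is easy to reproduce numerically. The sketch below assumes NumPy and uses arbitrary, hypothetical parameter values; the variable names (`Lind` for the inductance, `L_onsager` for the matrix $A\sigma$) are chosen here for illustration and do not come from the text.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
kT, C, Lind, R, gamma = 1.3, 0.7, 2.0, 0.5, 0.2

A = -np.array([[gamma, -1.0],
               [1.0 / (Lind * C), R / Lind]])   # drift matrix, (5.3.154)
sigma = np.diag([kT * C, kT / Lind])            # covariance, (5.3.157)
eps = np.diag([1.0, -1.0])                      # q even, i odd, (5.3.159)

L_onsager = A @ sigma                           # (5.3.142), compare (5.3.158)
B = -(A @ sigma + sigma @ A.T)                  # (5.3.130), compare (5.3.161)

# Noise amplitudes, compare (5.3.163).
b11 = np.sqrt(B[0, 0])
b22 = np.sqrt(B[1, 1])

print("eps L eps == L^T:", np.allclose(eps @ L_onsager @ eps, L_onsager.T))
print("B =", B)
```

The off-diagonal elements of `B` vanish and `b22 * Lind` equals $\sqrt{2kTR}$, which is the content of Nyquist's formula.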

These results are fluctuation dissipation results. The magnitudes of the fluctuations
$b_{ij}$ are determined by the dissipative terms $\gamma$ and $R$. In fact, the result (5.3.163)
is precisely Nyquist's theorem, which we discussed in Sect. 1.4.4. The noise voltage
in the circuit is given by


$$ \Delta V(t) = \sqrt{2kTR}\;\xi_2(t) \tag{5.3.164} $$
so that
$$ \langle \Delta V(t + \tau)\,\Delta V(t)\rangle = 2kTR\,\delta(\tau), \tag{5.3.165} $$


which is Nyquist's theorem in the form quoted in (1.4.49).
The terms $\gamma$ and $R$ are called dissipative because they give rise to energy
dissipation; in fact, deterministically (i.e., setting noise equal to 0),


$$ \frac{dE}{dt} = -i^2 R - \frac{\gamma}{C}\,q^2, \tag{5.3.166} $$
which explicitly exhibits the dissipation.


5.3.7 Eigenfunction Methods in Many Variables—Homogeneous Processes 
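Here we shall proceed similarly to Sect. 5.2.5. We assume the existence of a complete
set of eigenfunctions $P_\lambda(x)$ of the forward Fokker-Planck equation and a set $Q_\lambda(x)$ of the
backward Fokker-Planck equation. Thus,


$$ -\sum_i \partial_i[A_i(x)P_\lambda(x)] + \tfrac{1}{2}\sum_{i,j}\partial_i\partial_j[B_{ij}(x)P_\lambda(x)] = -\lambda P_\lambda(x) \tag{5.3.167} $$


$$ \sum_i A_i(x)\,\partial_i Q_{\lambda'}(x) + \tfrac{1}{2}\sum_{i,j}B_{ij}(x)\,\partial_i\partial_j Q_{\lambda'}(x) = -\lambda' Q_{\lambda'}(x). \tag{5.3.168} $$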


Whether $Q_{\lambda'}(x)$ and $P_\lambda(x)$ satisfy absorbing or reflecting boundary conditions, one
can show, in a manner very similar to that used in Sect. 5.2.4, that


$$ -(\lambda - \lambda')\int dx\, P_\lambda(x)Q_{\lambda'}(x) = 0 \tag{5.3.169} $$


so that the $P_\lambda(x)$ and $Q_{\lambda'}(x)$ form a bi-orthogonal set. However, the functional
relationship between the $P_\lambda$ and the $Q_\lambda$ only exists if detailed balance with all $\varepsilon_i = 1$
is valid. We assume


$$ \int dx\, P_\lambda(x)Q_{\lambda'}(x) = \delta_{\lambda\lambda'} \tag{5.3.170} $$


if the spectrum of eigenvalues $\lambda$ is discrete. [The Kronecker $\delta_{\lambda\lambda'}$ is to be replaced
by $\delta(\lambda - \lambda')$ when the spectrum is continuous, except where we have reflecting
boundary conditions and $\lambda = 0$ is stationary. The normalisation of $p_s(x)$ then gives


$$ \int dx\, P_0(x)Q_0(x) = \int dx\, P_0(x) = 1 \tag{5.3.171} $$


so that there is also a discrete point with zero eigenvalue in the spectrum then.]

For, if detailed balance is valid, we have already noted in Sect. 5.3.5 (iii) that
$p(\varepsilon x, -t)/p_s(x)$ is a solution of the backward Fokker-Planck equation so that,
from the uniqueness of solutions, we can say


$$ Q_\lambda(x) = \eta_\lambda P_\lambda(\varepsilon x)/p_s(x). \tag{5.3.172} $$


Here, $\eta_\lambda = \pm 1$ but is otherwise undetermined. For, if $\lambda = \lambda'$, we can then write
(5.3.170) in the form


$$ \int dx\, \frac{P_\lambda(x)P_\lambda(\varepsilon x)}{p_s(x)} = \eta_\lambda. \tag{5.3.173} $$


We cannot determine $\eta_\lambda$ a priori, but by suitable normalisation it may be chosen
$\pm 1$. If, for example, all $\varepsilon_i$ are $-1$ and $P_\lambda(x)$ is an odd function of $x$, it is clear that $\eta_\lambda$
must be $-1$.


i) Even Variables Only: Negativity of Eigenvalues. If all the $\varepsilon_i$ are equal to one,
then we can write


$$ Q_\lambda(x) = P_\lambda(x)/p_s(x) \tag{5.3.174} $$


and $\eta_\lambda$ can always be set equal to one.

Hence, the expansions in eigenfunctions will be much the same as for the
one-variable case.

The completeness of the eigenfunctions is a matter for proof in each individual
case. If all the $\varepsilon_i$ are equal to one, then we can show that the Fokker-Planck
operator is self adjoint and negative semi-definite in a certain Hilbert space.

To be more precise, let us write the forward Fokker-Planck operator as


$$ \mathcal{L}(x) = -\sum_i \partial_i A_i(x) + \tfrac{1}{2}\sum_{i,j}\partial_i\partial_j B_{ij}(x) \tag{5.3.175} $$


and the backward Fokker-Planck operator as


$$ \mathcal{L}^*(x) = \sum_i A_i(x)\,\partial_i + \tfrac{1}{2}\sum_{i,j}B_{ij}(x)\,\partial_i\partial_j. \tag{5.3.176} $$


Then the fact that, if all $\varepsilon_i = 1$, we can transform a solution of the forward FPE
to a solution of the backward FPE by dividing by $p_s(x)$ arises from the fact that for
any $f(x)$,


$$ \mathcal{L}(x)[f(x)p_s(x)] = p_s(x)\,\mathcal{L}^*(x)[f(x)]. \tag{5.3.177} $$


Define a Hilbert space of functions $f(x), g(x), \ldots$ by the scalar product


$$ (f, g) = \int dx\, \frac{f(x)g(x)}{p_s(x)}. \tag{5.3.178} $$
Then from (5.3.177),
$$ (f, \mathcal{L}g) = \int dx\, \frac{f(x)}{p_s(x)}\,\mathcal{L}(x)\left[\frac{g(x)}{p_s(x)}\,p_s(x)\right] \tag{5.3.179} $$
$$ = \int dx\, f(x)\,\mathcal{L}^*(x)\left[\frac{g(x)}{p_s(x)}\right] \tag{5.3.180} $$


and integrating by parts, discarding surface terms by use of either reflecting or
absorbing boundary conditions,


$$ = \int dx\, \frac{g(x)}{p_s(x)}\,\mathcal{L}(x)[f(x)]. \tag{5.3.181} $$


Thus, in this Hilbert space,
$$ (f, \mathcal{L}g) = (g, \mathcal{L}f). \tag{5.3.182} $$


This condition is sufficient to ensure that the eigenvalues of $\mathcal{L}(x)$ are real. To
prove negative semi-definiteness, notice that for even variables only [see (5.3.75)],


$$ A_i(x) = \sum_j \partial_j[B_{ij}(x)p_s(x)]/2p_s(x) \tag{5.3.183} $$


so that for any $p(x)$,


$$ \mathcal{L}(x)p(x) = \sum_{i,j}\partial_i\left\{-\frac{\partial_j[B_{ij}(x)p_s(x)]\,p(x)}{2p_s(x)} + \tfrac{1}{2}\partial_j[B_{ij}(x)p(x)]\right\} = \tfrac{1}{2}\sum_{i,j}\partial_i\{B_{ij}(x)p_s(x)\,\partial_j[p(x)/p_s(x)]\}. \tag{5.3.184} $$


Hence,
$$ (p, \mathcal{L}p) = \tfrac{1}{2}\int dx\, \frac{p(x)}{p_s(x)}\sum_{i,j}\partial_i\{B_{ij}(x)p_s(x)\,\partial_j[p(x)/p_s(x)]\} $$
and integrating by parts (discarding surface terms),


$$ (p, \mathcal{L}p) = -\tfrac{1}{2}\int dx \sum_{i,j} B_{ij}(x)p_s(x)\,\partial_i[p(x)/p_s(x)]\,\partial_j[p(x)/p_s(x)] \le 0 \tag{5.3.185} $$


since $B_{ij}(x)$ is positive semi-definite.
Hence, we conclude for any eigenfunction $P_\lambda(x)$ that $\lambda$ is real, and


$$ \lambda \ge 0 \tag{5.3.186} $$
(remember that $-\lambda$ is the eigenvalue).


ii) A Variational Principle. Combined with this property is a variational principle.
For, suppose we choose any real function $f(x)$ and expand it in eigenfunctions


$$ f(x) = \sum_\lambda a_\lambda P_\lambda(x). \tag{5.3.187} $$


Further, let us fix $(f, f) = \int dx\, f(x)^2/p_s(x) = 1$.
Then,


$$ -(f, \mathcal{L}f) = \sum_\lambda \lambda a_\lambda^2 \quad\text{and}\quad (f, f) = \sum_\lambda a_\lambda^2. \tag{5.3.188} $$

Clearly $-(f, \mathcal{L}f)$ has its minimum of zero only if only the term $\lambda = 0$ exists, i.e.,


$$ a_0 \ne 0, \qquad a_\lambda = 0 \ \text{ for } \lambda \ne 0. \tag{5.3.189} $$


Now choose $f(x)$ orthogonal to $P_0(x)$ so $a_0 = 0$, and we see that the minimum of
$-(f, \mathcal{L}f)$ occurs when


$$ a_{\lambda_1} = 1 \quad\text{and}\quad a_\lambda = 0 \ \text{ for all other } \lambda, \tag{5.3.190} $$


where $\lambda_1$ is the next smallest eigenvalue. This means that $P_{\lambda_1}(x)$ is obtained by
minimising $-(p, \mathcal{L}p)$ [which can be put in the form of (5.3.185)], subject to the
condition that


$$ (p, p) = 1, \qquad (p, P_0) = 0. \tag{5.3.191} $$


This method can yield a numerical way of estimating eigenfunctions in terms of
trial solutions and is useful particularly in bistability situations. It is, however,
limited to situations in which detailed balance is valid.


iii) Conditional Probability. Assuming completeness, we find that the conditional
probability can be written


$$ p(x, t\,|\,x_0, 0) = \sum_\lambda P_\lambda(x)Q_\lambda(x_0)\,e^{-\lambda t}. \tag{5.3.192} $$


iv) Autocorrelation Matrix. If detailed balance is valid, for the stationary
autocorrelation matrix [using (5.3.172)] we have


$$ G(t) = \langle x(t)x^T(0)\rangle = \sum_\lambda \eta_\lambda e^{-\lambda t}\left[\int dx\, xP_\lambda(x)\right]\left[\int dx\, xP_\lambda(x)\right]^T\varepsilon \tag{5.3.193} $$


which explicitly satisfies the condition
$$ G(t) = \varepsilon\, G^T(t)\,\varepsilon \tag{5.3.194} $$


derived in Sect. 5.3.4b.


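v) Spectrum Matrix. The spectrum matrix is


$$ S(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-i\omega t}G(t)\,dt = \frac{1}{2\pi}\left\{\int_0^\infty e^{-i\omega t}G(t)\,dt + \int_0^\infty e^{i\omega t}G^T(t)\,dt\right\}. \tag{5.3.195} $$
If we define, for convenience,
$$ U_\lambda = \int dx\, xP_\lambda(x), \tag{5.3.196} $$
then
$$ S(\omega) = \frac{1}{2\pi}\sum_\lambda \eta_\lambda\left[\frac{U_\lambda U_\lambda^T\varepsilon}{\lambda + i\omega} + \frac{\varepsilon\, U_\lambda U_\lambda^T}{\lambda - i\omega}\right]. \tag{5.3.197} $$


If any of the $\lambda$ are complex, from the reality of $\mathcal{L}(x)$ we find that the eigenfunction
belonging to $\lambda^*$ is $[P_\lambda(x)]^*$, $\eta_\lambda = \eta_{\lambda^*}$ and $[U_\lambda]^* = U_{\lambda^*}$. The spectrum is then
obtained by adding the complex conjugate of those terms involving complex
eigenvalues to (5.3.197).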

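In the case where $\varepsilon = 1$ and hence $\eta_\lambda = 1$, the two terms in (5.3.197) combine and the
spectrum matrix has the simpler form


$$ S(\omega) = \frac{1}{\pi}\sum_\lambda \frac{\lambda\, U_\lambda U_\lambda^T}{\lambda^2 + \omega^2}, \tag{5.3.198} $$


which is explicitly a positive definite matrix. The spectrum of a single variable
$q$ made up as a linear combination of the $x$ by a formula such as


$$ q = m \cdot x \tag{5.3.199} $$
is given by
$$ S_q(\omega) = m^T S(\omega)\,m = \frac{1}{\pi}\sum_\lambda \frac{\lambda\,(m\cdot U_\lambda)^2}{\lambda^2 + \omega^2} \tag{5.3.200} $$


and is a strictly decreasing function of $|\omega|$.
In the case where $\varepsilon \ne 1$, the positivity of the spectrum is no longer obvious,
though general considerations such as the result (3.7.19) show that it must be positive.
An important difference arises because the $\lambda$ now need not all be real, and this
means that denominators of the form


$$ 1/[\lambda_R^2 + (\omega - \lambda_I)^2] \tag{5.3.201} $$


(with $\lambda_R$ and $\lambda_I$ the real and imaginary parts of $\lambda$) can occur, giving rise to peaks in the spectrum away from $\omega = 0$.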
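
A peak away from $\omega = 0$ can be seen in the circuit of Fig. 5.4, which has one even and one odd variable. The sketch below is an illustration under stated assumptions: it assumes NumPy, uses hypothetical parameter values with no leakage ($\gamma = 0$), and evaluates the spectrum matrix through the standard linear-system expression $S(\omega) = (2\pi)^{-1}(-A + i\omega)^{-1}B\,(-A^T - i\omega)^{-1}$ rather than through the eigenfunction sum (5.3.197). That expression is assumed here to correspond to the result of Sect. 4.4.6 under the substitutions (5.3.131).

```python
import numpy as np

# Hypothetical, weakly damped circuit parameters (no leakage: gamma = 0).
kT, C, Lind, R, gamma = 1.0, 1.0, 1.0, 0.1, 0.0

A = -np.array([[gamma, -1.0],
               [1.0 / (Lind * C), R / Lind]])      # (5.3.154)
B = 2.0 * kT * np.diag([gamma * C, R / Lind**2])   # (5.3.161)

def spectrum_matrix(omega):
    """Spectrum matrix of the linear system at angular frequency omega."""
    M = np.linalg.inv(-A + 1j * omega * np.eye(2))
    # For real A, M^dagger equals (-A^T - i omega)^{-1}.
    return (M @ B @ M.conj().T) / (2.0 * np.pi)

omegas = np.linspace(0.01, 3.0, 3000)
current_spectrum = np.array([spectrum_matrix(w)[1, 1].real for w in omegas])
peak_frequency = omegas[np.argmax(current_spectrum)]

print("peak of the current spectrum near omega =", peak_frequency)
print("1/sqrt(LC) =", 1.0 / np.sqrt(Lind * C))
```

For these values the current spectrum is non-negative and peaks close to the resonance frequency $1/\sqrt{LC}$, not at zero frequency.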


5.4 First Exit Time from a Region (Homogeneous Processes) 


We wish to treat here the multidimensional analogue of the first passage time 
problem in one dimension, treated in Sect.5.2.7. As in that section, we will restrict 
ourselves to homogeneous processes. 

The analogous problem here is to compute the earliest time at which a particle, 
initially inside a region R with boundary S, leaves that region. 

As in the one-dimensional case, we consider the problem of solving the
backward Fokker-Planck equation with an absorbing boundary condition on $S$, namely,


$$ p(x', t\,|\,x, 0) = 0 \qquad (x \in S). \tag{5.4.1} $$
The probability that the particle, initially at $x$, is somewhere within $R$ after a time $t$
is

$$ G(x, t) = \int_R dx'\, p(x', t\,|\,x, 0) \tag{5.4.2} $$
and if $T$ is the time at which the particle leaves $R$, then


$$ \mathrm{Prob}(T \ge t) = G(x, t). \tag{5.4.3} $$


Since the process is homogeneous, we find that $G(x, t)$ obeys the backward
Fokker-Planck equation


$$ \partial_t G(x, t) = \sum_i A_i(x)\,\partial_i G(x, t) + \tfrac{1}{2}\sum_{i,j}B_{ij}(x)\,\partial_i\partial_j G(x, t). \tag{5.4.4} $$


The initial conditions on (5.4.4) will arise from:


i) the initial condition
$$ p(x', 0\,|\,x, 0) = \delta(x - x') \tag{5.4.5} $$
so that
$$ G(x, 0) = 1 \ \ (x \in R), \qquad G(x, 0) = 0 \ \text{ elsewhere}; \tag{5.4.6} $$
ii) the boundary condition (5.4.1) requires


$$ G(x, t) = 0 \qquad (x \in S). \tag{5.4.7} $$
As in Sect. 5.2.7, we find that these imply that the mean exit time from $R$ starting at
$x$, for which we shall use the symbol $T(x)$, satisfies


$$ \sum_i A_i(x)\,\partial_i T(x) + \tfrac{1}{2}\sum_{i,j}B_{ij}(x)\,\partial_i\partial_j T(x) = -1 \tag{5.4.8} $$
with the boundary condition
$$ T(x) = 0 \qquad (x \in S) \tag{5.4.9} $$


and the $n$th moments


$$ T_n(x) = \langle T^n\rangle = n\int_0^\infty t^{n-1}G(x, t)\,dt \tag{5.4.10} $$
satisfy

$$ -nT_{n-1}(x) = \sum_i A_i(x)\,\partial_i T_n(x) + \tfrac{1}{2}\sum_{i,j}B_{ij}(x)\,\partial_i\partial_j T_n(x) \tag{5.4.11} $$
with the boundary conditions

$$ T_n(x) = 0 \qquad (x \in S). \tag{5.4.12} $$
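
The step from the equation for $G(x, t)$ to the moment equations can be made explicit. The following is a short worked derivation, assuming only that $t^n G(x, t) \to 0$ as $t \to \infty$; the shorthand $\mathcal{L}^*$ for the backward operator $\sum_i A_i\partial_i + \tfrac12\sum_{i,j}B_{ij}\partial_i\partial_j$ is used for brevity. Since $\mathrm{Prob}(T \ge t) = G(x, t)$, the density of $T$ is $-\partial_t G(x, t)$, so that

$$ \langle T^n\rangle = -\int_0^\infty t^n\,\partial_t G(x, t)\,dt = n\int_0^\infty t^{n-1}G(x, t)\,dt . $$

Applying $\mathcal{L}^*$ and using (5.4.4) gives, for $n \ge 2$,

$$ \mathcal{L}^* T_n(x) = n\int_0^\infty t^{n-1}\,\partial_t G(x, t)\,dt = -n(n-1)\int_0^\infty t^{n-2}G(x, t)\,dt = -nT_{n-1}(x), $$

while for $n = 1$, with $T_0 = 1$,

$$ \mathcal{L}^* T_1(x) = \int_0^\infty \partial_t G(x, t)\,dt = G(x, \infty) - G(x, 0) = -1 \qquad (x \in R), $$

which is (5.4.8).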
Inclusion of Reflecting Regions. It is possible to consider that $S$, the boundary of
$R$, is divided into two regions $S_r$ and $S_a$ such that the particle is reflected when
it meets $S_r$ and is absorbed when it meets $S_a$. The boundary conditions on $G(x, t)$
are then, from those derived in Sect. 5.2.4,


$$ \sum_{i,j} n_i B_{ij}(x)\,\partial_j G(x, t) = 0 \qquad (x \in S_r) \tag{5.4.13} $$
$$ G(x, t) = 0 \qquad (x \in S_a) \tag{5.4.14} $$

and hence,
$$ \sum_{i,j} n_i B_{ij}(x)\,\partial_j T_n(x) = 0 \qquad (x \in S_r) \tag{5.4.15} $$
$$ T_n(x) = 0 \qquad (x \in S_a). \tag{5.4.16} $$


5.4.1 Solutions of Mean Exit Time Problems 


The basic partial differential equation for the mean first passage time is only simple 
to solve in one dimension or in situations where there is a particular symmetry 
available. Asymptotic approximations can provide very powerful results, but these 
will be dealt with in Chap.9. We will illustrate some methods here with some 
examples. 




a) Ornstein-Uhlenbeck Process in Two Dimensions (Rotationally Symmetric)
We suppose that a particle moves according to


$$ dx = -kx\,dt + \sqrt{D}\,dW_1(t), \qquad dy = -ky\,dt + \sqrt{D}\,dW_2(t) \tag{5.4.17} $$
and want to know the mean exit time from the region
$$ x^2 + y^2 \le a^2 \tag{5.4.18} $$


given the initial position is at $x_0, y_0$.
This is easily reduced to the one-dimensional problem for the variable


$$ r = \sqrt{x^2 + y^2} \tag{5.4.19} $$
by changing variables as in Sect. 4.4.5, namely,
$$ dr = (-kr + \tfrac{1}{2}D/r)\,dt + \sqrt{D}\,dW(t) \tag{5.4.20} $$


and we want to know the mean exit time from the region $(0, a)$. This can be solved
by using (5.2.165) with the replacements


$$ \begin{aligned} U(x) &\to \tfrac{1}{2}kr^2 - \tfrac{1}{2}D\log r \\ D &\to \tfrac{1}{2}D \\ x_0 &\to a \\ a &\to \sqrt{x_0^2 + y_0^2} \equiv r_0 \\ -\infty &\to 0. \end{aligned} \tag{5.4.21} $$


Thus,
$$ T(r_0 \to a) = \frac{2}{D}\int_{r_0}^{a} y^{-1}\exp[ky^2/D]\,dy\int_0^{y} z\exp(-kz^2/D)\,dz. \tag{5.4.22} $$


The problem is thus essentially one dimensional. This does not often happen.
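
The double integral for the mean exit time is straightforward to evaluate numerically. The sketch below assumes NumPy and an initial radius $r_0 > 0$; the function names and the trapezoidal quadrature are choices made here for illustration. It evaluates the double integral directly, and also the single integral $k^{-1}\int_{r_0}^{a}[\exp(ky^2/D) - 1]\,y^{-1}dy$ obtained by doing the inner Gaussian integral in closed form.

```python
import numpy as np

def trapezoid(values, grid):
    """Composite trapezoidal rule for samples `values` on `grid`."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))

def mean_exit_time(r0, a, k, D, n=2001):
    """Nested quadrature of the double integral (5.4.22); requires r0 > 0."""
    ys = np.linspace(r0, a, n)
    outer = np.empty(n)
    for j, y in enumerate(ys):
        zs = np.linspace(0.0, y, n)
        inner = trapezoid(zs * np.exp(-k * zs**2 / D), zs)
        outer[j] = np.exp(k * y**2 / D) / y * inner
    return 2.0 / D * trapezoid(outer, ys)

def mean_exit_time_reduced(r0, a, k, D, n=2001):
    """Same quantity after performing the inner integral analytically."""
    ys = np.linspace(r0, a, n)
    return trapezoid(np.expm1(k * ys**2 / D) / (k * ys), ys)

t_double = mean_exit_time(0.5, 2.0, 1.0, 1.0)
t_single = mean_exit_time_reduced(0.5, 2.0, 1.0, 1.0)
t_free = mean_exit_time_reduced(0.5, 2.0, 1e-6, 1.0)   # nearly free diffusion
print(t_double, t_single, t_free)
```

As $k \to 0$ the result tends to $(a^2 - r_0^2)/(2D)$, the mean exit time of free two-dimensional diffusion from a disc, which gives a simple check on the formula.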


b) Application of Eigenfunctions
Suppose we use the eigenfunctions $Q_\lambda(x)$ and $P_\lambda(x)$ to expand the mean first passage
time as


$$ T(x) = \sum_\lambda t_\lambda Q_\lambda(x). \tag{5.4.23} $$


We suppose that the $P_\lambda(x)$ and $Q_\lambda(x)$ satisfy the boundary conditions required
for the particular exit problem being studied, so that $T(x)$ as written in (5.4.23)
satisfies the appropriate boundary conditions.

We then expand


$$ -1 = \sum_\lambda I_\lambda Q_\lambda(x), \tag{5.4.24} $$


where


$$ I_\lambda = -\int_R dx\, P_\lambda(x). \tag{5.4.25} $$


Inserting (5.4.23) into (5.4.8) yields


$$ \lambda t_\lambda = -I_\lambda \tag{5.4.26} $$
so that
$$ T(x) = \sum_\lambda \frac{1}{\lambda}\,Q_\lambda(x)\int_R dx'\, P_\lambda(x'). \tag{5.4.27} $$


The success of the method depends on the knowledge of the eigenfunctions satisfying
the correct boundary conditions on $S$ and normalised on $R$.


c) Asymptotic Result

If the first eigenvalue $\lambda_1$ is very much less than all other eigenvalues, the series
may be approximated by its first term. This will mean that the eigenfunction $Q_1$
will be very close to a solution of


$$ \sum_i A_i(x)\,\partial_i f(x) + \tfrac{1}{2}\sum_{i,j}B_{ij}(x)\,\partial_i\partial_j f(x) = 0 \tag{5.4.28} $$
since $\lambda_1$ is very small. Hence,


$$ Q_1(x) \sim K \tag{5.4.29} $$


where $K$ is a constant. Taking account of the bi-orthonormality of $P_1$ and $Q_1$, we see


$$ 1 = \int dx\, P_1(x)Q_1(x) \sim K\int dx\, P_1(x) \tag{5.4.30} $$
so that
$$ T(x) \sim 1/\lambda_1. \tag{5.4.31} $$


The reasoning given here is rather crude. It can be refined by the asymptotic
methods of Chap. 9.


d) Application of the Eigenfunction Method
Two-dimensional Brownian motion: the particle moves in the $xy$ plane within a
square whose corners are (0, 0), (0, 1), (1, 0), (1, 1). The sides of this square are
absorbing barriers. $T(x, y)$ obeys
$$ \frac{D}{2}\left(\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2}\right) = -1. \tag{5.4.32} $$


The eigenfunctions satisfying the boundary condition $T = 0$ on the edges of the
square are


$$ P_{n,m}(x, y) = \sin(n\pi x)\sin(m\pi y) \tag{5.4.33} $$
with
$$ Q_{n,m}(x, y) = 4\sin(n\pi x)\sin(m\pi y) \tag{5.4.34} $$


and $n$, $m$ positive and integral. The eigenvalues are


$$ \lambda_{n,m} = \frac{\pi^2 D}{2}\,(n^2 + m^2). \tag{5.4.35} $$


The coefficient


$$ \int_R dx\,dy\, P_{n,m}(x, y) = \begin{cases} 0 & (\text{either } n \text{ or } m \text{ even}) \\ \dfrac{4}{mn\pi^2} & (m \text{ and } n \text{ both odd}). \end{cases} \tag{5.4.36} $$
Hence,
$$ T(x, y) = \frac{1}{D}\sum_{n,m\ \mathrm{odd}} \frac{32}{\pi^4 nm(m^2 + n^2)}\,\sin(n\pi x)\sin(m\pi y). \tag{5.4.37} $$
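
The double series for the unit square converges quickly enough to be summed directly. The sketch below is a plain-Python illustration; the function name and the truncation parameter `n_max` are hypothetical choices, not part of the text.

```python
import math

def mean_exit_time_square(x, y, D=1.0, n_max=199):
    """Truncated double sine series for T(x, y) on the unit square.

    Only odd n and m contribute; n_max is the largest index retained.
    """
    total = 0.0
    for n in range(1, n_max + 1, 2):
        for m in range(1, n_max + 1, 2):
            coefficient = 32.0 / (math.pi**4 * n * m * (n * n + m * m))
            total += coefficient * math.sin(n * math.pi * x) * math.sin(m * math.pi * y)
    return total / D

t_centre = mean_exit_time_square(0.5, 0.5)
t_edge = mean_exit_time_square(0.0, 0.3)
print("T at the centre:", t_centre)
print("T on the absorbing edge x = 0:", t_edge)
```

The mean exit time is largest at the centre of the square, vanishes on the absorbing edges, and is symmetric under exchange of $x$ and $y$, as the symmetry of the problem demands.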


5.4.2 Distribution of Exit Points


This problem is the multidimensional analogue of that treated in Sect. 5.2.8.
Namely, what is the probability of exiting through an element $dS(a)$ at $a$ of the
boundary $S$ of the region $R$? We assume absorption on all $S$.

The probability that the particle exits through $dS(a)$ after time $t$ is


$$ g(a, x, t)\,|dS(a)| = \int_t^\infty dt'\, J(a, t'\,|\,x, 0)\cdot dS(a). \tag{5.4.38} $$


Fig. 5.5. Region and surface considered in Sect. 5.4.2

Similar reasoning to that in Sect. 5.2.8 shows that $g(a, x, t)$ obeys the backward
Fokker-Planck equation
$$ \sum_i A_i(x)\,\partial_i g(a, x, t) + \tfrac{1}{2}\sum_{i,j}B_{ij}(x)\,\partial_i\partial_j g(a, x, t) = \partial_t g(a, x, t). \tag{5.4.39} $$
The boundary conditions follow by definition. Initially we have
$$ g(a, x, 0) = 0 \quad \text{for } x \ne a,\ x \in R \tag{5.4.40} $$
and at all times
$$ g(a, x, t) = 0 \quad \text{for } x \ne a,\ x \in S. \tag{5.4.41} $$
If $x = a$, then exit through $dS(a)$ is certain, hence,
$$ g(a, a, t)\,dS(a) = 1 \quad \text{for all } t \tag{5.4.42} $$
or effectively
$$ g(a, x, t) = \delta_s(a - x) \quad x \in S,\ \text{for all } t, \tag{5.4.43} $$
where $\delta_s(a - x)$ is an appropriate surface delta function such that
$$ \int |dS(a)|\,\delta_s(a - x) = 1. \tag{5.4.44} $$
The probability of ultimate exit through $dS(a)$ is
$$ \pi(a, x)\,|dS(a)| = g(a, x, 0)\,|dS(a)|. \tag{5.4.45} $$


The mean exit time given that exit occurred at $a$ is
$$ T(a, x) = \int_0^\infty dt\, g(a, x, t)/\pi(a, x) \tag{5.4.46} $$


and in the same way as in Sect. 5.2.8, we show that this satisfies


$$ \sum_i A_i(x)\,\partial_i[\pi(a, x)T(a, x)] + \tfrac{1}{2}\sum_{i,j}B_{ij}(x)\,\partial_i\partial_j[\pi(a, x)T(a, x)] = -\pi(a, x) \tag{5.4.47} $$


and the boundary condition is


$$ \pi(a, x)T(a, x) = 0, \quad x \in S. \tag{5.4.48} $$


Further, by letting $t \to \infty$ in the corresponding Fokker-Planck equation for
$g(a, x, t)$, we obtain the equation for $\pi(a, x)$:
$$ \sum_i A_i(x)\,\partial_i \pi(a, x) + \tfrac{1}{2}\sum_{i,j}B_{ij}(x)\,\partial_i\partial_j \pi(a, x) = 0. \tag{5.4.49} $$
The boundary condition for (5.4.49) is
$$ \pi(a, x) = 0 \quad a \ne x,\ x \in S \tag{5.4.50} $$
and
$$ \int |dS(a)|\,\pi(a, x) = 1. \tag{5.4.51} $$
Thus, we can summarise this as
$$ \pi(a, x) = \delta_s(a - x) \quad x \in S, \tag{5.4.52} $$


where $\delta_s(a - x)$ is the surface delta function for the boundary $S$.


6. Approximation Methods for Diffusion Processes 


The methods described in the previous two chapters have concentrated on exact
results, and of course the results available are limited in their usefulness.
Approximation methods are the essence of most applications, where some way of reducing a
problem to an exactly soluble one is always sought. It could even be said that most
work on applications is concerned with the development of various approximations.

There are two major approximation methods of great significance. The first is
the small noise expansion theory, which gives solutions linearised about a
deterministic equation. Since noise is often small, this is a method of wide practical
application; the equations are reduced to a sequence of time-dependent
Ornstein-Uhlenbeck processes. Mostly the first order is used.

Another large class of methods is given by adiabatic elimination, in which differ- 
ent time scales are identified and fast variables are eliminated completely. This 
forms the basis of the second half of the chapter. 


6.1 Small Noise Perturbation Theories 


In many physical and chemical problems, the stochastic element in a dynamical 
system arises from thermal fluctuations, which are always very small. Unless one 
measures very carefully, it is difficult to detect the existence of fluctuations. In such a 
case, the time development of the system will be almost deterministic and the 
fluctuations will be a small perturbation. 

With this in mind, we consider a simple linear example which is exactly soluble:
a one-variable Ornstein-Uhlenbeck process described by the stochastic differential
equation


$$ dx = -kx\,dt + \varepsilon\,dW(t) \tag{6.1.1} $$
for which the Fokker-Planck equation is
$$ \partial_t p = \partial_x(kx\,p) + \tfrac{1}{2}\varepsilon^2\partial_x^2 p. \tag{6.1.2} $$


The solutions of these have been previously investigated in Sects. 3.8.4, 4.4.4. Here
$\varepsilon$ is a small parameter which is zero in the deterministic limit. However, the limit
$\varepsilon \to 0$ is essentially different in the two cases.

In the stochastic differential equation (SDE) (6.1.1), as $\varepsilon \to 0$, the differential
equation becomes nonstochastic but remains of first order in $t$, and the limit $\varepsilon \to 0$
is therefore not singular. In contrast, in the Fokker-Planck equation (FPE) (6.1.2),
the limit $\varepsilon \to 0$ reduces a second-order differential equation to one of first order.
This limit is singular and any perturbation theory is a singular perturbation theory.
The solution to (6.1.1) is known exactly; it is (4.4.26)


$$ x(t) = ce^{-kt} + \varepsilon\int_0^t e^{-k(t - t')}\,dW(t') \tag{6.1.3} $$


which can be written
$$ x(t) = x_0(t) + \varepsilon x_1(t) \tag{6.1.4} $$


and this is generic; that is, we can normally solve in a power series in the small
parameter $\varepsilon$. Furthermore, the zero-order term $x_0(t)$ is the solution of the equation
obtained by setting $\varepsilon = 0$, i.e., of


$$ dx = -kx\,dt. \tag{6.1.5} $$


The situation is by no means so simple for the Fokker-Planck equation (6.1.2).
Assuming the initial condition $c$ is a nonstochastic variable, the exact solution is the
Gaussian with mean and variance given by


$$ \langle x(t)\rangle = \alpha(t) = ce^{-kt} \tag{6.1.6} $$
$$ \mathrm{var}\{x(t)\} = \varepsilon^2\beta(t) = \varepsilon^2(1 - e^{-2kt})/2k \tag{6.1.7} $$
so that


$$ p_\varepsilon(x, t\,|\,c, 0) = \frac{1}{\varepsilon}\,\frac{1}{\sqrt{2\pi\beta(t)}}\exp\left\{-\frac{1}{\varepsilon^2}\,\frac{[x - \alpha(t)]^2}{2\beta(t)}\right\}. \tag{6.1.8} $$

The solution for the conditional probability has the limiting form as $\varepsilon \to 0$ of
$$ p_\varepsilon(x, t\,|\,c, 0) \to \delta[x - \alpha(t)] \tag{6.1.9} $$

which corresponds exactly to the zero-order solution of the SDE, which is a
deterministic trajectory along the path $x(t) = c\exp(-kt)$. However, $p_\varepsilon$ cannot be
expanded as a simple power series in $\varepsilon$. To carry out a power series expansion, one
must define a scaled variable at each time,


$$ y = [x - \alpha(t)]/\varepsilon \tag{6.1.10} $$


so that a probability density for $y$ is
$$ \bar{p}_\varepsilon(y, t\,|\,0, 0) = p_\varepsilon(x, t\,|\,c, 0)\,\frac{dx}{dy} \tag{6.1.11} $$
$$ = \frac{1}{\sqrt{2\pi\beta(t)}}\exp\left[-\frac{y^2}{2\beta(t)}\right]. \tag{6.1.12} $$
The probability density for the scaled variable $y$ has no singularity; indeed, we have
no $\varepsilon$ dependence.
The transformation (6.1.10) can be rewritten as


$$ x = \alpha(t) + \varepsilon y \tag{6.1.13} $$


and is to be interpreted as follows. From (6.1.12) we see that the distribution of $y$ is
Gaussian with mean zero, variance $\beta(t)$. Equation (6.1.13) says that the deviation of
$x$ from $\alpha(t)$, the deterministic path, is of order $\varepsilon$ as $\varepsilon \to 0$, and the coefficient is the
Gaussian random variable $y$. This is essentially the same conclusion as was reached
by the SDE method.
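
The scaling argument can be illustrated by direct simulation. The sketch below assumes NumPy and uses a simple Euler-Maruyama discretisation with hypothetical parameter values; the function name, step size and number of paths are illustrative choices, not a prescription from the text. It simulates (6.1.1) and forms the scaled variable of (6.1.10), whose sample mean and variance should be close to $0$ and $\beta(t)$ whatever the value of $\varepsilon$.

```python
import numpy as np

def simulate_ou(c, k, eps, t_end, n_steps, n_paths, seed=0):
    """Euler-Maruyama paths of dx = -k x dt + eps dW, all started at x(0) = c."""
    rng = np.random.default_rng(seed)
    dt = t_end / n_steps
    x = np.full(n_paths, c, dtype=float)
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        x += -k * x * dt + eps * dW
    return x

c, k, eps, t_end = 1.0, 1.0, 0.1, 1.0   # hypothetical values
x = simulate_ou(c, k, eps, t_end, n_steps=1000, n_paths=20000)

alpha = c * np.exp(-k * t_end)                        # (6.1.6)
beta = (1.0 - np.exp(-2.0 * k * t_end)) / (2.0 * k)   # (6.1.7) without eps^2
y = (x - alpha) / eps                                 # scaled variable (6.1.10)

print("sample mean of y:", y.mean())
print("sample variance of y:", y.var(), " beta(t):", beta)
```

Repeating the run with a different `eps` changes the spread of `x` but leaves the statistics of `y` essentially unchanged, up to sampling and discretisation error.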

The general form of these results is the following. We have a system described
by the SDE


$$ dx = a(x)\,dt + \varepsilon b(x)\,dW(t). \tag{6.1.14} $$
Then we can write the solution as
$$ x(t) = x_0(t) + \varepsilon x_1(t) + \varepsilon^2 x_2(t) + \ldots \tag{6.1.15} $$


and solve successively for the $x_n(t)$. In particular, $x_0(t)$ is the solution of the
deterministic equation


$$ dx = a(x)\,dt. \tag{6.1.16} $$
Alternatively, we consider the Fokker-Planck equation

$$ \partial_t p = -\partial_x[a(x)p] + \tfrac{1}{2}\varepsilon^2\partial_x^2[b(x)^2 p]. \tag{6.1.17} $$
Then by changing the variable to the scaled variable and thus writing

$$ y = [x - x_0(t)]/\varepsilon \tag{6.1.18} $$

$$ \bar{p}_\varepsilon(y, t) = \varepsilon\, p(x, t\,|\,c, 0), \tag{6.1.19} $$
we can write the perturbation expansion

$$ \bar{p}_\varepsilon(y, t) = \bar{p}_0(y, t) + \varepsilon\bar{p}_1(y, t) + \varepsilon^2\bar{p}_2(y, t) + \ldots. \tag{6.1.20} $$


Here we will find that $\bar{p}_0(y, t)$ is indeed a genuine probability density, i.e., is positive
and normalised, while the higher-order terms are negative in some regions. Thus, it
can be said that the Fokker-Planck perturbation theory is not probabilistic. In
contrast, the SDE theory expands in a series of random variables $x_n(t)$, each of which
has its own probability distribution. At every stage, the system is probabilistic.
And finally, the most noticeable difference. The first term in the SDE perturbation
theory is $x_0(t)$, which is the solution of the SDE obtained by setting $\varepsilon = 0$ in
(6.1.1). In contrast, the first term $\bar{p}_0(y, t)$ in (6.1.20) is not the solution of the
equation obtained by setting $\varepsilon = 0$ in (6.1.2). In general, it is a limiting form of a FPE
for $\bar{p}_\varepsilon(y, t)$ obtained by setting $\varepsilon = 0$ in the FPE for the scaled variable, $y$.


6.2 Small Noise Expansions for Stochastic Differential Equations 


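We consider a stochastic differential equation of the form
$$ dx = a(x)\,dt + \varepsilon b(x)\,dW(t) \tag{6.2.1} $$


in which $\varepsilon$ is a small parameter. At this stage we exclude time dependence from $a(x)$
and $b(x)$; this is done only to simplify the algebra, and the results and
methods are exactly the same when time dependence is included. We then assume that the solution $x(t)$ of (6.2.1) can
be written


$$ x(t) = x_0(t) + \varepsilon x_1(t) + \varepsilon^2 x_2(t) + \ldots. \tag{6.2.2} $$
We also assume that we can write


$$ a(x) = a(x_0 + \varepsilon x_1 + \varepsilon^2 x_2 + \ldots) \tag{6.2.3} $$
$$ = a_0(x_0) + \varepsilon a_1(x_0, x_1) + \varepsilon^2 a_2(x_0, x_1, x_2) + \ldots. \tag{6.2.4} $$


The particular functional dependence in (6.2.4) is important and is easy to
demonstrate, for


$$ a(x) = a\Bigl(x_0 + \sum_{m=1}^{\infty}\varepsilon^m x_m\Bigr) = \sum_{p=0}^{\infty}\frac{1}{p!}\,\frac{d^p a(x_0)}{dx_0^p}\Bigl(\sum_{m=1}^{\infty}\varepsilon^m x_m\Bigr)^p. \tag{6.2.5} $$


Formally resumming is not easy, but it is straightforward to compute the first few
powers of $\varepsilon$ and to obtain


$$ \begin{aligned} a_0(x_0) &= a(x_0) \\ a_1(x_0, x_1) &= x_1\frac{da(x_0)}{dx_0} \\ a_2(x_0, x_1, x_2) &= x_2\frac{da(x_0)}{dx_0} + \tfrac{1}{2}x_1^2\frac{d^2a(x_0)}{dx_0^2} \\ a_3(x_0, x_1, x_2, x_3) &= x_3\frac{da(x_0)}{dx_0} + x_1x_2\frac{d^2a(x_0)}{dx_0^2} + \tfrac{1}{6}x_1^3\frac{d^3a(x_0)}{dx_0^3}. \end{aligned} \tag{6.2.6} $$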

Although it is not easy to write explicitly the full set of terms in general, it is easy to
see that we can always write for $n \ge 1$,


$$ a_n(x_0, x_1, \ldots, x_n) = x_n\frac{da(x_0)}{dx_0} + A_n(x_0, x_1, \ldots, x_{n-1}), \tag{6.2.7} $$
where it is seen that $A_n$ is independent of $x_n$. In fact, it is easy to see directly from
(6.2.5) that the coefficient of $\varepsilon^n$ can only involve $x_n$ if this is contributed from the
term $p = 1$, and the only other possibility for $\varepsilon^n$ to arise is from terms with $m < n$,
which do not involve $x_n$.
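
The low-order coefficients in the expansion of $a(x)$ can be checked numerically for a concrete drift. The sketch below uses plain Python with the hypothetical choice $a(x) = \sin x$ and arbitrary values for $x_0, \ldots, x_3$; it compares $a(x_0 + \varepsilon x_1 + \varepsilon^2 x_2 + \varepsilon^3 x_3)$ with the series truncated after the $\varepsilon^3$ term, so the discrepancy should scale as $\varepsilon^4$.

```python
import math

# Hypothetical drift and its first three derivatives.
def a(x):   return math.sin(x)
def da(x):  return math.cos(x)
def d2a(x): return -math.sin(x)
def d3a(x): return -math.cos(x)

def truncated_series(eps, x0, x1, x2, x3):
    """Sum of eps^n a_n for n = 0..3, with a_n taken from (6.2.6)."""
    a0 = a(x0)
    a1 = x1 * da(x0)
    a2 = x2 * da(x0) + 0.5 * x1**2 * d2a(x0)
    a3 = x3 * da(x0) + x1 * x2 * d2a(x0) + x1**3 * d3a(x0) / 6.0
    return a0 + eps * a1 + eps**2 * a2 + eps**3 * a3

def truncation_error(eps, xs=(0.7, 0.3, -0.2, 0.5)):
    """Absolute difference between a(x) and the truncated series."""
    x0, x1, x2, x3 = xs
    exact = a(x0 + eps * x1 + eps**2 * x2 + eps**3 * x3)
    return abs(exact - truncated_series(eps, *xs))

error_small = truncation_error(0.01)
error_large = truncation_error(0.02)
print(error_small, error_large, error_large / error_small)
```

Doubling $\varepsilon$ multiplies the error by a factor close to $2^4 = 16$, consistent with all terms through $\varepsilon^3$ being captured.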

The form (6.2.7) for $a_n$ is very important. It is clear that if we also require


$$ b(x) = b_0(x_0) + \varepsilon b_1(x_0, x_1) + \varepsilon^2 b_2(x_0, x_1, x_2) + \ldots, \tag{6.2.8} $$


then all the same conclusions apply. However, they are not so important in the
perturbation expansion.

We now substitute the expansions (6.2.2, 7, 8) in the stochastic differential
equation (6.2.1) and equate coefficients of like powers of $\varepsilon$. We then obtain an infinite
set of SDEs. In these we use the notation


$$ k(x_0) = -\frac{da(x_0)}{dx_0} \tag{6.2.9} $$
which simplifies the notation.
We obtain
$$ dx_0 = a(x_0)\,dt \tag{6.2.10a} $$
$$ dx_n = [-k(x_0)x_n + A_n(x_0, \ldots, x_{n-1})]\,dt + b_{n-1}(x_0, \ldots, x_{n-1})\,dW(t). \tag{6.2.10b} $$


These equations can now be solved sequentially. Equation (6.2.10a) is a (possibly
nonlinear) ordinary differential equation whose solution is assumed to be known,
subject to an initial condition. It is of course possible to assume independent
nontrivial initial conditions for all the $x_n$, but this is unnecessarily complicated. It is
simplest to write (setting $t = 0$ as the initial time)


$$ x_0(0) = x(0), \qquad x_n(0) = 0 \quad (n \ge 1). \tag{6.2.11} $$
Assuming that the solution of (6.2.10a) is given by
$$ x_0(t) = \alpha(t), \tag{6.2.12} $$
the equation (6.2.10b) for $n = 1$ can be written as
$$ dx_1 = -k[\alpha(t)]\,x_1\,dt + b[\alpha(t)]\,dW(t), \tag{6.2.13} $$


where we have noted from (6.2.5) that $A_1$ vanishes and $b_0 = b$.

This equation, the first of a perturbation theory, is a time-dependent
Ornstein-Uhlenbeck process whose solution can be obtained straightforwardly by the
methods of Sect. 4.4.9, reduced to one dimension. The solution is obtained simply
by multiplying by the integrating factor
$$ \exp\Bigl\{\int_0^t dt'\, k[\alpha(t')]\Bigr\} $$
and is
$$ x_1(t) = \int_0^t b[\alpha(t')]\exp\Bigl\{-\int_{t'}^{t} k[\alpha(s)]\,ds\Bigr\}\,dW(t'), \tag{6.2.14} $$


where the initial condition $x_1(0) = 0$ has been included.

For many purposes this form is quite adequate and amounts to a linearisation
of the original equation about the deterministic solution. Higher-order terms are
more complex because of the more complicated form of (6.2.10b) but are, in
essence, treated in exactly the same way. In order to solve the equation for $x_N(t)$,
we assume we know all the $x_n(t)$ for $n < N$ so that $A_N$ and $b_{N-1}$ become known
(stochastic) functions of $t$ after substituting these solutions. Then (6.2.10b) becomes


$$ dx_N = \{-k[\alpha(t)]\,x_N + A_N(t)\}\,dt + b_{N-1}(t)\,dW(t) \tag{6.2.15} $$


whose solution is obtained directly, or from Sect. 4.4.9, as
$$ x_N(t) = \int_0^t [A_N(t')\,dt' + b_{N-1}(t')\,dW(t')]\exp\Bigl\{-\int_{t'}^{t} k[\alpha(s)]\,ds\Bigr\}. \tag{6.2.16} $$


Formally, the procedure is now complete. The range of validity of the method and
its practicability are as yet unanswered. Like all perturbation theories, terms rapidly
become unwieldy with increasing order.
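
The first-order term is usually all that is needed in practice, and its variance can be obtained without simulating any noise. Applying Ito's formula to the linear equation for $x_1$ gives $d\langle x_1^2\rangle/dt = -2k[\alpha(t)]\langle x_1^2\rangle + b[\alpha(t)]^2$ with $\langle x_1\rangle = 0$, so that $\mathrm{var}\{x(t)\} \approx \varepsilon^2\langle x_1^2\rangle$ to lowest order. The sketch below, in plain Python, integrates this equation together with the deterministic equation by a classical fourth-order Runge-Kutta step; the function name and argument names are hypothetical, and the linear test case is chosen only because its answer is known in closed form.

```python
import math

def linearised_moments(a, k, b, x_init, t_end, n_steps=2000):
    """Integrate d(alpha)/dt = a(alpha), dv/dt = -2 k(alpha) v + b(alpha)^2.

    Returns alpha(t_end), the deterministic path, and v(t_end), the variance
    of the first-order term x_1, starting from alpha(0) = x_init, v(0) = 0.
    """
    def rhs(alpha, v):
        return a(alpha), -2.0 * k(alpha) * v + b(alpha) ** 2

    dt = t_end / n_steps
    alpha, v = x_init, 0.0
    for _ in range(n_steps):
        k1 = rhs(alpha, v)
        k2 = rhs(alpha + 0.5 * dt * k1[0], v + 0.5 * dt * k1[1])
        k3 = rhs(alpha + 0.5 * dt * k2[0], v + 0.5 * dt * k2[1])
        k4 = rhs(alpha + dt * k3[0], v + dt * k3[1])
        alpha += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        v += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return alpha, v

# Linear test case a(x) = -2x, b(x) = 1, for which k = -da/dx = 2.
alpha_lin, v_lin = linearised_moments(
    a=lambda x: -2.0 * x, k=lambda x: 2.0, b=lambda x: 1.0,
    x_init=1.0, t_end=0.5)
print(alpha_lin, math.exp(-1.0))
print(v_lin, (1.0 - math.exp(-2.0)) / 4.0)
```

For a nonlinear drift one passes the corresponding `a` and `k = -da/dx`; the same two coupled ordinary differential equations then give the linearised mean path and variance.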




6.2.1 Validity of the Expansion 


The expansion will not normally be a convergent power series. For (6.2.14) shows
that $x_1(t)$ is a Gaussian variable, being simply an Ito integral with nonstochastic
coefficients, and hence $x_1(t)$ can with finite probability assume a value greater than
any fixed value. Thus, only if all power series involved in the derivation of (6.2.5)
are convergent for all arguments, no matter how large, can we expect the method
to yield a convergent expansion.

We can, in fact, show that the expansion is asymptotic by using the results on
dependence on a parameter given in Sect. 4.3.7.
We define a remainder by


$$ y_n(\varepsilon, t) = \Bigl[x(t) - \sum_{r=0}^{n}\varepsilon^r x_r(t)\Bigr]/\varepsilon^{n+1}, \tag{6.2.17} $$


where the $x_r(t)$ are solutions of the set of stochastic differential equations (6.2.10)
with initial conditions (6.2.11).
We then derive an equation for $y_n(t)$. We can write


$$ a[x(t)] = a\Bigl[\sum_{r=0}^{n}\varepsilon^r x_r(t) + y_n(\varepsilon, t)\,\varepsilon^{n+1}\Bigr] \tag{6.2.18} $$
and we define a function $\hat{a}_{n+1}[X_0, X_1, X_2, \ldots, X_n, Y, \varepsilon]$ by
$$ \hat{a}_{n+1}[X_0, X_1, \ldots, X_n, Y, \varepsilon] = \varepsilon^{-n-1}\Bigl\{a\Bigl[\sum_{r=0}^{n}\varepsilon^r X_r + \varepsilon^{n+1}Y\Bigr] - \sum_{r=0}^{n}\varepsilon^r a_r(X_0, X_1, \ldots, X_r)\Bigr\}. \tag{6.2.19} $$


We require that for all fixed $X_0, X_1, \ldots, X_n, Y$,


$$ \lim_{\varepsilon\to 0}\hat{a}_{n+1}[X_0, X_1, \ldots, X_n, Y, \varepsilon] \tag{6.2.20} $$


exists. We similarly define $\hat{b}_n[X_0, X_1, \ldots, X_{n-1}, Y, \varepsilon]$ and impose the same condition on it.
This condition is not probabilistic, but expresses required analytic properties
of the functions $a(x)$ and $b(x)$; it requires, in fact, that the expansions (6.2.4, 8)
be merely asymptotic expansions.
Now we can write the differential equation for $y_n(\varepsilon, t)$ as


$$ dy_n = \hat{a}_{n+1}[x_0(t), x_1(t), \ldots, x_n(t), y_n, \varepsilon]\,dt + \hat{b}_n[x_0(t), x_1(t), \ldots, x_{n-1}(t), y_n, \varepsilon]\,dW(t). \tag{6.2.21} $$
The coefficients of $dt$ and $dW(t)$ are now stochastic functions because the $x_r(t)$ are
stochastic. However, the requirement (6.2.20) is now an almost certain limit, and
hence implies the existence of the stochastic limits


$$ \mathop{\text{st-lim}}_{\varepsilon\to 0}\,\hat{a}_{n+1}[x_0(t), x_1(t), \ldots, x_n(t), y_n, \varepsilon] \equiv \tilde{a}_{n+1}(t, y_n) \tag{6.2.22} $$
and
$$ \mathop{\text{st-lim}}_{\varepsilon\to 0}\,\hat{b}_n[x_0(t), x_1(t), \ldots, x_{n-1}(t), y_n, \varepsilon] \equiv \tilde{b}_n(t, y_n) \tag{6.2.23} $$


which is sufficient to satisfy the result of Sect. 4.3.7 on the continuity of solutions
of the SDE (6.2.21) with respect to the parameter $\varepsilon$, provided the appropriate
Lipschitz conditions (ii) and (iii) of Sect. 4.3.7 are satisfied. Thus, $y_n(0, t)$ exists as a
solution of the SDE


$$ dy_n(0, t) = \tilde{a}_{n+1}[t, y_n(0, t)]\,dt + \tilde{b}_n[t, y_n(0, t)]\,dW(t) \tag{6.2.24} $$
which, from the definition (6.2.17), shows that

$$ x(t) - \sum_{r=0}^{n}\varepsilon^r x_r(t) \sim \varepsilon^{n+1}. \tag{6.2.25} $$
Hence, the expansion in powers of $\varepsilon$ is an asymptotic expansion.


6.2.2 Stationary Solutions (Homogeneous Processes) 


A stationary solution is obtained by letting $t \to \infty$. If the process is, as written,
homogeneous and ergodic, it does not matter what the initial condition is. In this
case, one chooses $x_0(0)$ so that $a[x_0(0)]$ vanishes and the solution to (6.2.10a)
is

$$ x_0(t) = x_0(0) \equiv \alpha \tag{6.2.26} $$

[where we write simply $\alpha$ instead of $\alpha(t)$].
Because of the initial condition at $t = 0$, the solution (6.2.14) to the equation of
order one is not a stationary process. One must either let $t \to \infty$ or set the initial
condition not at $t = 0$, but at $t = -\infty$. Choosing the latter, we have

$$ x_1^s(t) = \int_{-\infty}^{t} b(\alpha) \exp[-(t - t')k(\alpha)]\, dW(t') . \tag{6.2.27} $$

Similarly,

$$ x_n^s(t) = \int_{-\infty}^{t} \big[ A_n^s(t')\, dt' + b_{n-1}^s(t')\, dW(t') \big] \exp[-(t - t')k(\alpha)] , \tag{6.2.28} $$


where by $A_n^s$ and $b_{n-1}^s$ we mean the values of $A_n$ and $b_{n-1}$ obtained by inserting the
stationary values of all arguments. From (6.2.28) it is clear that $x_n^s(t)$ is, by construction,
stationary. Clearly the integrals in (6.2.27, 28) converge only if $k(\alpha) > 0$,
which will mean that only a stable stationary solution of the deterministic process
generates a stationary solution by this method. This is rather obvious: the addition
of fluctuations to an unstable state drives the system away from that state.


6.2.3 Mean, Variance, and Time Correlation Function 


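If the series expansion in $\varepsilon$ is valid in some sense, it is useful to know the expansion
for mean and variance. Clearly

$$ \langle x(t) \rangle = \sum_{n=0}^{\infty} \varepsilon^n \langle x_n(t) \rangle \tag{6.2.29} $$

$$ \operatorname{var}\{x(t)\} = \sum_{n=0}^{\infty} \varepsilon^n \sum_{m=0}^{n} \big[ \langle x_m(t) x_{n-m}(t) \rangle - \langle x_m(t) \rangle \langle x_{n-m}(t) \rangle \big] . \tag{6.2.30} $$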


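Since, however, we assume a deterministic initial condition and $x_0(t)$ is hence deterministic,
all terms involving $x_0(t)$ vanish. We can then work out that

$$ \operatorname{var}\{x(t)\} = \varepsilon^2 \operatorname{var}\{x_1(t)\} + 2\varepsilon^3 \langle x_1(t), x_2(t) \rangle + \varepsilon^4 \big[ 2\langle x_1(t), x_3(t) \rangle + \operatorname{var}\{x_2(t)\} \big] + \dots \tag{6.2.31} $$

and similarly,

$$ \langle x(t), x(s) \rangle = \varepsilon^2 \langle x_1(t), x_1(s) \rangle + \varepsilon^3 \big[ \langle x_1(t), x_2(s) \rangle + \langle x_1(s), x_2(t) \rangle \big] + \varepsilon^4 \big[ \langle x_1(t), x_3(s) \rangle + \langle x_1(s), x_3(t) \rangle + \langle x_2(t), x_2(s) \rangle \big] + \dots \tag{6.2.32} $$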
6.2.4 Failure of Small Noise Perturbation Theories


a) Example: Cubic Process and Related Behaviour 
Consider the stochastic differential equation 


$$ dx = -x^3\, dt + \varepsilon\, dW(t) . \tag{6.2.33} $$


It is not difficult to see that the expansion conditions (6.2.20) are trivially satisfied
for the coefficients of both $dt$ and $dW(t)$, and in fact for any finite $t$, an asymptotic
expansion with terms $x_n(t)$ given by (6.2.16) is valid. However, at $x = 0$, it is clear
that

$$ \frac{d}{dx}(x^3)\Big|_{x=0} = k(0) = 0 , \tag{6.2.34} $$

and because $x = 0$ is the stationary solution of the deterministic equation, the perturbation
series for stationary solutions is not likely to converge since the exponential
time factors are all constant. For example, the first-order term in the stationary
expansion is, from (6.2.27),

$$ x_1^s(t) = \int_{-\infty}^{t} dW(t') = W(t) - W(-\infty) \tag{6.2.35} $$


which is infinite with probability one (being a Gaussian variable with infinite 
variance). 

The problem is rather obvious. Near x = 0, the motion described by (6.2.33) 
is simply not able to be approximated by an Ornstein-Uhlenbeck process. For 
example, the stationary probability distribution, which is the stationary solution 
of the FPE 


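$$ \partial_t p = \partial_x (x^3 p) + \tfrac{1}{2} \varepsilon^2 \partial_x^2 p , \tag{6.2.36} $$

is given by

$$ p_s(x) = \mathcal{N} \exp(-x^4 / 2\varepsilon^2) \tag{6.2.37} $$

and the moments are

$$ \langle x^n \rangle_s = \begin{cases} (2\varepsilon^2)^{n/4}\, \Gamma\!\big(\tfrac{n+1}{4}\big) \big/ \Gamma\!\big(\tfrac{1}{4}\big) & (n \text{ even}) \\ 0 & (n \text{ odd}). \end{cases} \tag{6.2.38} $$

The lowest-order term of the expansion of the variance is proportional to $\varepsilon$ to the
first power, not $\varepsilon^2$ as in (6.2.31).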
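The scaling of the variance can be checked numerically. The following is a minimal sketch, not part of the original text: the function names are illustrative, the closed form is the even-moment formula (6.2.38), and the quadrature routine is an independent midpoint-rule check against the stationary density (6.2.37).

```python
import math


def cubic_moment(n: int, eps: float) -> float:
    """Stationary moment <x^n> of dx = -x^3 dt + eps dW, closed form (6.2.38)."""
    if n % 2 == 1:
        return 0.0  # odd moments vanish by the symmetry x -> -x
    return (2.0 * eps**2) ** (n / 4) * math.gamma((n + 1) / 4) / math.gamma(0.25)


def cubic_moment_quadrature(n: int, eps: float,
                            half_width: float = 6.0, steps: int = 20000) -> float:
    """Same moment from a midpoint-rule integral of x^n exp(-x^4 / (2 eps^2))."""
    scale = math.sqrt(eps)  # the density has width of order sqrt(eps)
    h = 2.0 * half_width * scale / steps
    num = 0.0
    den = 0.0
    for i in range(steps):
        x = -half_width * scale + (i + 0.5) * h
        weight = math.exp(-x**4 / (2.0 * eps**2))
        num += x**n * weight
        den += weight
    return num / den


if __name__ == "__main__":
    for eps in (0.1, 0.05, 0.025):
        var = cubic_moment(2, eps)
        # var / eps stays constant, whereas var / eps^2 grows as eps -> 0
        print(f"eps={eps:6.3f}  var={var:.6f}  var/eps={var / eps:.6f}")
```

The ratio var/eps is independent of eps, which is the statement that the variance is first order in $\varepsilon$ rather than second order.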

In this case, we must simply regard the cubic process described by (6.2.33) as a
fundamental process. If we introduce the new scaled variables through the definitions

$$ x = \sqrt{\varepsilon}\, y , \qquad t = \tau / \varepsilon \tag{6.2.39} $$

and use

$$ dW(\tau / \varepsilon) = dW(\tau) / \sqrt{\varepsilon} , \tag{6.2.40} $$

then the cubic process can be reduced to a parameterless form

$$ dy = -y^3\, d\tau + dW(\tau) . \tag{6.2.41} $$

Regarding the solution of (6.2.41) as a known quantity, we can write

$$ x(t) = \sqrt{\varepsilon}\, y(\varepsilon t) \tag{6.2.42} $$

so that the limit $\varepsilon \to 0$ is approached like $\sqrt{\varepsilon}$, and also with a slower time scale.
This kind of scaling result is the basis of many critical phenomena.

A successful perturbation theory in the case where $a(x)$ behaves like $x^3$ near $x = 0$
must involve firstly the change of variables (6.2.39), then a similar kind of perturbation
theory to that already outlined, but in which the zero-order solution is
the cubic process. Thus, let us assume that we can write

$$ a(x) = -x^3 c(x) , \tag{6.2.43} $$

where $c(x)$ is a smooth function with $c(0) \neq 0$. Then, using the transformations
(6.2.39, 40), we can rewrite the SDE as

$$ dy = -y^3 c(y\sqrt{\varepsilon})\, d\tau + b(y\sqrt{\varepsilon})\, dW(\tau) . \tag{6.2.44} $$

If we expand $y(\tau)$, $c(y\sqrt{\varepsilon})$, $b(y\sqrt{\varepsilon})$ as series in $\sqrt{\varepsilon}$, we obtain a perturbation
theory. If we write

$$ y(\tau) = \sum_{n=0}^{\infty} \varepsilon^{n/2} y_n(\tau) , \tag{6.2.45} $$

then we get for the first two terms

$$ dy_0 = -y_0^3\, c(0)\, d\tau + b(0)\, dW(\tau) \tag{6.2.46} $$
$$ dy_1 = -\Big[ 3 y_0^2\, c(0)\, y_1 + y_0^4\, \frac{dc}{dx}(0) \Big] d\tau + \Big[ \frac{db}{dx}(0)\, y_0 \Big] dW(\tau) . \tag{6.2.47} $$


We see that the equation for $y_1$ is in fact that of a time-dependent Ornstein-Uhlenbeck
process with stochastic coefficients. Thus, in principle, as long as the cubic
process is known, the rest is easily computed. In practice, not a great deal is in fact
known about the cubic process and this kind of perturbation is not very practical.
b) Order $\nu$ Processes ($\nu$ odd)
If instead we have the stochastic differential equation

$$ dx = -x^{\nu}\, dt + \varepsilon\, dW(t) , \tag{6.2.48} $$

we find a stable deterministic state only if $\nu$ is an odd integer. In this case we make
the transformation to scaled variables

$$ x = y\, \varepsilon^{2/(1+\nu)} \tag{6.2.49} $$

$$ t = \tau\, \varepsilon^{2(1-\nu)/(1+\nu)} \tag{6.2.50} $$


and follow very similar procedures to those used for the cubic process. 
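The exponents in (6.2.49, 50) can be recovered by a short calculation. The following worked step is a sketch added for illustration; the trial exponents $p$ and $q$ are names introduced here, not notation from the text. Substituting $x = y\,\varepsilon^{p}$ and $t = \tau\,\varepsilon^{q}$ into (6.2.48), and using $dW(\tau\varepsilon^{q}) = \varepsilon^{q/2}\, dW(\tau)$, gives

```latex
\begin{align*}
\varepsilon^{p}\, dy &= -\varepsilon^{p\nu + q}\, y^{\nu}\, d\tau
                        + \varepsilon^{1 + q/2}\, dW(\tau) . \\
\intertext{All three terms carry the same power of $\varepsilon$ when}
p &= p\nu + q , \qquad p = 1 + \tfrac{1}{2} q , \\
\intertext{whose solution is}
p &= \frac{2}{1+\nu} , \qquad q = \frac{2(1-\nu)}{1+\nu} ,
\end{align*}
```

so that $dy = -y^{\nu}\, d\tau + dW(\tau)$ is parameterless. For $\nu = 3$ this reduces to $p = 1/2$, $q = -1$, which is the cubic-process scaling (6.2.39).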


c) Bistable Systems 
Suppose we have a system described by 


$$ dx = (x - x^3)\, dt + \varepsilon\, dW(t) \tag{6.2.51} $$

for which there are three deterministic stationary states, at $x = 0, \pm 1$. The
state at $x = 0$ is deterministically unstable, and we can see directly from the perturbation
theory that no stationary process arises from it, since the exponentials in
the perturbation series integrals (6.2.27, 28) have increasing arguments.

The solutions $x_0(t)$ of the deterministic differential equation

$$ dx/dt = x - x^3 \tag{6.2.52} $$

divide into three classes depending on their behaviour as $t \to \infty$. Namely,

i) $x_0(0) < 0 \Rightarrow x_0(t) \to -1$
ii) $x_0(0) = 0 \Rightarrow x_0(t) = 0$ for all $t$
iii) $x_0(0) > 0 \Rightarrow x_0(t) \to 1$.


Thus, depending on the initial condition, we get two different asymptotic expansions,
whose stationary limits represent the fluctuations about the two deterministic
stationary states. There is no information in these solutions about the
possible jump from the branch $x = 1$ to the branch $x = -1$, or conversely, at least
not in any obvious form. In this sense the asymptotic expansion fails, since it does
not give a picture of the overall behaviour of the stationary state. We will see in
Chap. 9 that this is because behaviour characteristic of jumps from one branch
to the other is typically of the order of magnitude
of $\exp(-1/\varepsilon^2)$, which approaches zero faster than any power as $\varepsilon \to 0$, and thus is
not represented in an expansion in powers of $\varepsilon$.
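A few numbers make the "faster than any power" statement concrete. This is an illustrative sketch only: the constant in the exponent is set to 1 for convenience, and the function names are hypothetical.

```python
import math


def jump_weight(eps: float) -> float:
    """Order of magnitude exp(-1/eps^2) attributed to branch-to-branch jumps."""
    return math.exp(-1.0 / eps**2)


def ratio_to_power(eps: float, n: int) -> float:
    """exp(-1/eps^2) divided by eps^n; tends to 0 as eps -> 0 for every fixed n."""
    return jump_weight(eps) / eps**n


if __name__ == "__main__":
    for n in (2, 10, 20):
        row = ", ".join(f"{ratio_to_power(eps, n):.3e}" for eps in (0.5, 0.2, 0.1, 0.05))
        print(f"n={n:2d}: {row}")
```

Every coefficient of the power series of $\exp(-1/\varepsilon^2)$ about $\varepsilon = 0$ therefore vanishes, which is why no finite order of the small noise expansion can contain the jump contribution.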


6.3 Small Noise Expansion of the Fokker-Planck Equation 


As mentioned in Sect. 6.1, a small noise expansion of a Fokker-Planck equation is
a singular expansion involving the introduction of scaled variables. Let us consider
how this is done.
We consider the Fokker-Planck equation

$$ \partial_t p = -\partial_x [A(x) p] + \tfrac{1}{2} \varepsilon^2 \partial_x^2 [B(x) p] . \tag{6.3.1} $$

We assume the solution of the deterministic equation to be $\alpha(t)$ so that

$$ d_t \alpha(t) = A[\alpha(t)] \tag{6.3.2} $$

and introduce new variables $(y, s)$ by

$$ y = [x - \alpha(t)] / \varepsilon \tag{6.3.3} $$

$$ s = t \tag{6.3.4} $$

and $\hat p(y, s) = \varepsilon\, p(x, t)$. (6.3.5)


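We note that

$$ \partial_t \hat p(y, s) = \frac{\partial \hat p}{\partial y} \frac{\partial y}{\partial t} + \frac{\partial \hat p}{\partial s} \frac{\partial s}{\partial t} = -\frac{\dot\alpha(t)}{\varepsilon} \frac{\partial \hat p}{\partial y} + \frac{\partial \hat p}{\partial s} \tag{6.3.6} $$

$$ \partial_x \hat p(y, s) = \frac{1}{\varepsilon} \frac{\partial \hat p}{\partial y} \tag{6.3.7} $$

so that substituting into (6.3.1) we get, with the help of the equation of motion
(6.3.2) for $\alpha(t)$,

$$ \frac{\partial \hat p}{\partial s} = -\frac{\partial}{\partial y} \Big\{ \frac{A[\alpha(s) + \varepsilon y] - A[\alpha(s)]}{\varepsilon}\, \hat p \Big\} + \frac{1}{2} \frac{\partial^2}{\partial y^2} \big\{ B[\alpha(s) + \varepsilon y]\, \hat p \big\} . \tag{6.3.8} $$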


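We are now in a position to make an expansion in powers of $\varepsilon$. We assume that $A$
and $B$ have an expansion in powers of $\varepsilon$ of the form

$$ A[\alpha(s) + \varepsilon y] = \sum_{n=0}^{\infty} \tilde A_n(s)\, \varepsilon^n y^n \tag{6.3.9} $$

$$ B[\alpha(s) + \varepsilon y] = \sum_{n=0}^{\infty} \tilde B_n(s)\, \varepsilon^n y^n \tag{6.3.10} $$

and expand $\hat p$ in powers of $\varepsilon$:

$$ \hat p = \sum_{n=0}^{\infty} \hat p_n\, \varepsilon^n . \tag{6.3.11} $$

Substituting these expansions into the FPE (6.3.8), we get by equating coefficients

$$ \frac{\partial \hat p_0}{\partial s} = -\tilde A_1(s) \frac{\partial}{\partial y} (y \hat p_0) + \frac{1}{2} \tilde B_0(s) \frac{\partial^2 \hat p_0}{\partial y^2} \tag{6.3.12} $$

$$ \frac{\partial \hat p_1}{\partial s} = -\frac{\partial}{\partial y} \big[ \tilde A_1(s)\, y\, \hat p_1 + \tilde A_2(s)\, y^2 \hat p_0 \big] + \frac{1}{2} \frac{\partial^2}{\partial y^2} \big[ \tilde B_0(s)\, \hat p_1 + \tilde B_1(s)\, y\, \hat p_0 \big] \tag{6.3.13} $$

and, in general,

$$ \frac{\partial \hat p_r}{\partial s} = -\frac{\partial}{\partial y} \Big[ \sum_{m=0}^{r} y^{r-m+1} \tilde A_{r-m+1}(s)\, \hat p_m \Big] + \frac{1}{2} \frac{\partial^2}{\partial y^2} \Big[ \sum_{m=0}^{r} y^{r-m} \tilde B_{r-m}(s)\, \hat p_m \Big] . \tag{6.3.14} $$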


Only the equation for $\hat p_0$ is a FPE and, as mentioned in Sect. 6.1, only $\hat p_0$ is a probability.
The first equation in the hierarchy, (6.3.12), is a time-dependent Ornstein-Uhlenbeck
process which corresponds exactly to (6.2.13), the first equation in the
hierarchy for the stochastic differential equation. Thereafter the correspondence
ceases.

The boundary conditions on the $\hat p_r$ do present technical difficulties since the
transformation from $x$ to $y$ is time dependent, and a boundary at a fixed position in
the $x$ variable corresponds to a moving boundary in the $y$ variable. Further, a
boundary at $x = a$ corresponds to one at

$$ y = \frac{a - \alpha(s)}{\varepsilon} \tag{6.3.15} $$

which approaches $\pm\infty$ as $\varepsilon \to 0$. There does not seem to be any known technique
of treating such boundaries, except when $a = \pm\infty$, so that the $y$ boundary is also
at $\pm\infty$ and hence constant. Boundary conditions then assume the same form in the
$y$ variable as in the $x$ variable.

In the case where the boundaries are at infinity, the result of the transformation
(6.3.3) is to change a singular perturbation problem (6.3.1) (in which the limit
$\varepsilon \to 0$ yields an equation of lower order) into an ordinary perturbation problem
(6.3.8) in which the coefficients of the equation depend smoothly on $\varepsilon$, and the limit
$\varepsilon \to 0$ is an equation of 2nd order. The validity of the expansion method will
depend on the form of the coefficients.


6.3.1 Equations for Moments and Autocorrelation Functions 


The hierarchy (6.3.14) is not very tractable, but yields a relatively straightforward
procedure for computing the moments perturbatively. We assume that the boundaries
are at $\pm\infty$ so that we can integrate by parts and discard surface terms.
Then we define

$$ \langle [y(t)]^n \rangle = \sum_{r=0}^{\infty} \varepsilon^r M_r^n(t) . \tag{6.3.16} $$

Then clearly

$$ M_r^n(t) = \int dy\, y^n\, \hat p_r(y, t) . \tag{6.3.17} $$

Then using (6.3.12-14), we easily derive by integrating by parts

$$ \frac{d M_r^n(t)}{dt} = \sum_{m=0}^{r} \Big[ n\, \tilde A_{r-m+1}(t)\, M_m^{n+r-m}(t) + \frac{n(n-1)}{2}\, \tilde B_{r-m}(t)\, M_m^{n+r-m-2}(t) \Big] \tag{6.3.18} $$

which is a closed hierarchy of equations, since the equation for $M_r^n(t)$ can be solved
if all $M_m^p(t)$ are known for $m < r$ and $p < n + r$.
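Writing out the first few for the mean $M_r^1(t)$ and mean square $M_r^2(t)$, we find

$$ \frac{d M_0^1(t)}{dt} = \tilde A_1(t) M_0^1(t) \tag{6.3.19} $$

$$ \frac{d M_1^1(t)}{dt} = \tilde A_1(t) M_1^1(t) + \tilde A_2(t) M_0^2(t) \tag{6.3.20} $$

$$ \frac{d M_2^1(t)}{dt} = \tilde A_1(t) M_2^1(t) + \tilde A_2(t) M_1^2(t) + \tilde A_3(t) M_0^3(t) \tag{6.3.21} $$

$$ \frac{d M_0^2(t)}{dt} = 2 \tilde A_1(t) M_0^2(t) + \tilde B_0(t) \tag{6.3.22} $$

$$ \frac{d M_1^2(t)}{dt} = 2 \tilde A_1(t) M_1^2(t) + 2 \tilde A_2(t) M_0^3(t) + \tilde B_1(t) M_0^1(t) \tag{6.3.23} $$

$$ \frac{d M_0^3(t)}{dt} = 3 \tilde A_1(t) M_0^3(t) + 3 \tilde B_0(t) M_0^1(t) . \tag{6.3.24} $$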


In deriving the last two equations we note that

$$ M_r^0(t) = \int dy\, \hat p_r(y, t) \tag{6.3.25} $$

and using

$$ \int dy\, \hat p(y, t) = 1 = \sum_{r} \varepsilon^r M_r^0(t) \tag{6.3.26} $$

we see that

$$ M_0^0(t) = 1 \tag{6.3.27} $$

$$ M_r^0(t) = 0 , \qquad r \geq 1 . \tag{6.3.28} $$

The equations are linear ordinary differential equations with inhomogeneities that
are computed from lower equations in the hierarchy.


a) Stationary Moments 
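These are obtained by letting $t \to \infty$ and setting the left-hand side of (6.3.18) equal
to zero. (All coefficients, $\tilde A$, $\tilde B$, etc. are taken time independent.)

From (6.3.19-24) we find

$$ \begin{aligned} M_0^1(\infty) &= 0 \\ M_0^2(\infty) &= -\tfrac{1}{2} \tilde B_0 / \tilde A_1 \\ M_0^3(\infty) &= 0 \\ M_1^1(\infty) &= -\tilde A_2 M_0^2(\infty) / \tilde A_1 = \tfrac{1}{2} \tilde A_2 \tilde B_0 / (\tilde A_1)^2 \\ M_1^2(\infty) &= -\tilde A_2 M_0^3(\infty) / \tilde A_1 = 0 \\ M_2^1(\infty) &= -[\tilde A_2 M_1^2(\infty) + \tilde A_3 M_0^3(\infty)] / \tilde A_1 = 0 . \end{aligned} \tag{6.3.29} $$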


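Thus, the stationary mean, variance, etc., are

$$ \langle x \rangle_s = \alpha + \varepsilon [ M_0^1(\infty) + \varepsilon M_1^1(\infty) + \varepsilon^2 M_2^1(\infty) ] = \alpha + \tfrac{1}{2} \varepsilon^2 \tilde A_2 \tilde B_0 / (\tilde A_1)^2 \tag{6.3.30} $$

$$ \operatorname{var}\{x\}_s = \langle x^2 \rangle_s - \langle x \rangle_s^2 = \langle (\alpha + \varepsilon y)^2 \rangle_s - \langle \alpha + \varepsilon y \rangle_s^2 = \varepsilon^2 \operatorname{var}\{y\}_s = -\frac{\varepsilon^2}{2} \frac{\tilde B_0}{\tilde A_1} \quad \text{to order } \varepsilon^2 . \tag{6.3.31} $$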


The procedure can clearly be carried on to arbitrarily high order. Of course 
in a one-variable system, the stationary distribution can be evaluated exactly and 
the moments found by integration. But in many variable systems this is not always 
possible, whereas the multivariate extension of this method is always able to be 
carried out. 
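The stationary values can be cross-checked by integrating the moment equations (6.3.19)-(6.3.24) forward in time. The sketch below is illustrative only: the coefficient values are arbitrary choices (with $\tilde A_1 < 0$ for stability), the initial condition $y(0) = 0$ is an assumption, and the integrator is a plain fourth-order Runge-Kutta step written out by hand.

```python
def moment_rhs(m, A1, A2, A3, B0, B1):
    """Right-hand sides of (6.3.19)-(6.3.24) for time-independent coefficients."""
    M01, M11, M21, M02, M12, M03 = m
    return [
        A1 * M01,                                # (6.3.19)
        A1 * M11 + A2 * M02,                     # (6.3.20)
        A1 * M21 + A2 * M12 + A3 * M03,          # (6.3.21)
        2 * A1 * M02 + B0,                       # (6.3.22)
        2 * A1 * M12 + 2 * A2 * M03 + B1 * M01,  # (6.3.23)
        3 * A1 * M03 + 3 * B0 * M01,             # (6.3.24)
    ]


def rk4_step(m, dt, *coeffs):
    """One classical Runge-Kutta step for the moment vector m."""
    def shifted(base, slope, scale):
        return [b + scale * s for b, s in zip(base, slope)]

    k1 = moment_rhs(m, *coeffs)
    k2 = moment_rhs(shifted(m, k1, dt / 2), *coeffs)
    k3 = moment_rhs(shifted(m, k2, dt / 2), *coeffs)
    k4 = moment_rhs(shifted(m, k3, dt), *coeffs)
    return [mi + dt / 6 * (a + 2 * b + 2 * c + d)
            for mi, a, b, c, d in zip(m, k1, k2, k3, k4)]


def integrate_moments(t_end, dt, *coeffs):
    """Integrate from y(0) = 0, i.e. all six moments start at zero."""
    m = [0.0] * 6
    for _ in range(int(round(t_end / dt))):
        m = rk4_step(m, dt, *coeffs)
    return m


# Hypothetical coefficient values chosen only for illustration.
A1, A2, A3, B0, B1 = -1.0, -0.5, -1.0, 1.0, 0.3
M01, M11, M21, M02, M12, M03 = integrate_moments(30.0, 0.01, A1, A2, A3, B0, B1)

if __name__ == "__main__":
    print("M_0^2:", M02, " expected", -0.5 * B0 / A1)
    print("M_1^1:", M11, " expected", 0.5 * A2 * B0 / A1**2)
```

At long times the integrated values approach the entries of (6.3.29); the lower-order moments feed the higher-order equations as inhomogeneous terms, which is the closed structure described above.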


b) Stationary Autocorrelation Function 


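The autocorrelation function of $x$ is simply related to that of $y$ in a stationary
state by

$$ \langle x(t) x(0) \rangle_s = \alpha^2 + \varepsilon^2 \langle y(t) y(0) \rangle_s \tag{6.3.32} $$

and a hierarchy of equations for $\langle y(t) y(0) \rangle_s$ is easily developed. Notice that

$$ \frac{d}{dt} \langle y(t)^n y(0) \rangle_s = \Big\langle \Big\{ \frac{A[\alpha + \varepsilon y(t)] - A(\alpha)}{\varepsilon}\, n\, y(t)^{n-1} + \tfrac{1}{2} n(n-1)\, B[\alpha + \varepsilon y(t)]\, y(t)^{n-2} \Big\}\, y(0) \Big\rangle_s \tag{6.3.33} $$

which can be derived by using the FPE (6.3.1) for $p(y, t \mid y_0, t_0)$ and integrating by
parts, or by using Ito's formula for the corresponding SDE.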


Using the definition of $\tilde A_n$, $\tilde B_n$ in (6.3.9, 10) and expanding $A$ and $B$ in a power
series, we get

$$ \frac{d}{dt} \langle y(t)^n y(0) \rangle_s = \sum_{q=0}^{\infty} \varepsilon^q \Big[ n \tilde A_{q+1} \langle y(t)^{q+n} y(0) \rangle_s + \frac{n(n-1)}{2} \tilde B_q \langle y(t)^{q+n-2} y(0) \rangle_s \Big] . \tag{6.3.34} $$

These equations themselves form a hierarchy which can be simply solved in a power
series in $\varepsilon$. Normally one is most interested in $\langle y(t) y(0) \rangle_s$, which can be calculated
to order $\varepsilon^q$, provided one knows $\langle y(t)^p y(0) \rangle_s$ for $p \leq q + 1$. We have the initial
condition

$$ \langle y(0)^n y(0) \rangle_s = \langle y^{n+1} \rangle_s \tag{6.3.35} $$

and the stationary moments can be evaluated from the stationary distribution,
or as has just been described.


6.3.2 Example 


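We consider the Fokker-Planck equation

$$ \frac{\partial p}{\partial t} = \frac{\partial}{\partial x} [(x + x^3) p] + \frac{\varepsilon^2}{2} \frac{\partial^2 p}{\partial x^2} \tag{6.3.36} $$

for which we have [in the stationary state $\alpha(t) = 0$]

$$ \tilde A_1 = -1 , \quad \tilde A_2 = 0 , \quad \tilde A_3 = -1 , \quad \tilde A_q = 0 \ (q > 3) , \quad \tilde B_q = \delta_{q,0} , \quad \alpha = 0 . \tag{6.3.37} $$

Using (6.3.30, 31) we have

$$ \langle x \rangle_s = 0 , \qquad \operatorname{var}\{x\}_s = \varepsilon^2 / 2 . \tag{6.3.38} $$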
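Since this is a one-variable system, the lowest-order variance can be compared with the exact stationary distribution. The sketch below assumes the zero-current stationary solution $p_s(x) \propto \exp[-(x^2 + x^4/2)/\varepsilon^2]$ of (6.3.36), which is derived here rather than quoted from the text, and evaluates its variance by midpoint-rule quadrature; the function name and grid parameters are illustrative.

```python
import math


def stationary_variance(eps: float, half_width: float = 8.0, steps: int = 20000) -> float:
    """Variance of p_s(x) ~ exp(-(x^2 + x^4/2) / eps^2) by midpoint quadrature.

    The integration window is |x| <= half_width * eps, far into the Gaussian tail.
    """
    lower = -half_width * eps
    h = 2.0 * half_width * eps / steps
    num = 0.0
    den = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * h
        weight = math.exp(-(x**2 + 0.5 * x**4) / eps**2)
        num += x * x * weight
        den += weight
    return num / den  # the mean is zero by symmetry


if __name__ == "__main__":
    for eps in (0.2, 0.1, 0.05):
        ratio = stationary_variance(eps) / eps**2
        print(f"eps={eps:5.2f}  var/eps^2={ratio:.6f}")
```

The ratio var/eps^2 approaches 1/2 as $\varepsilon \to 0$ and lies slightly below it for finite $\varepsilon$, because the quartic term narrows the distribution; the difference is a higher-order correction of the kind generated by the hierarchy (6.3.18).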
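For convenience, let us use the notation

$$ c_n(t) = \langle y^n(t)\, y(0) \rangle_s \tag{6.3.39} $$

so that the equations for $c_1$ and $c_3$ are

$$ \frac{d}{dt} \begin{bmatrix} c_1 \\ c_3 \end{bmatrix} = \begin{bmatrix} -1 & -\varepsilon^2 \\ 3 & -3 \end{bmatrix} \begin{bmatrix} c_1 \\ c_3 \end{bmatrix} \tag{6.3.40} $$

[the equations for the $c_{2n}$ decouple from those for $c_{2n+1}$ because $B(x)$ is constant and
$A(x)$ is an odd function of $x$].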
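It is simpler to solve (6.3.40) exactly than to perturb. The eigenvalues of the
matrix are

$$ \lambda_1 = -2 + \sqrt{1 - 3\varepsilon^2} , \qquad \lambda_2 = -2 - \sqrt{1 - 3\varepsilon^2} \tag{6.3.41} $$

and the corresponding eigenvectors

$$ u_1 = \begin{bmatrix} \tfrac{1}{3} \big( 1 + \sqrt{1 - 3\varepsilon^2} \big) \\ 1 \end{bmatrix} , \qquad u_2 = \begin{bmatrix} \tfrac{1}{3} \big( 1 - \sqrt{1 - 3\varepsilon^2} \big) \\ 1 \end{bmatrix} . \tag{6.3.42} $$

The solution of (6.3.40) can then be written

$$ \begin{bmatrix} c_1(t) \\ c_3(t) \end{bmatrix} = \alpha_1 e^{\lambda_1 t} u_1 + \alpha_2 e^{\lambda_2 t} u_2 \qquad (t > 0) . \tag{6.3.43} $$

The initial condition is

$$ c_1(0) = \langle y^2 \rangle_s , \qquad c_3(0) = \langle y^4 \rangle_s . \tag{6.3.44} $$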


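We can compute $\langle y^4 \rangle_s$ using the moment hierarchy (6.3.18) extended to $M_0^4$: we
find

$$ M_0^4 = -\frac{3}{2} \frac{\tilde B_0 M_0^2}{\tilde A_1} = \frac{3}{4} \tag{6.3.45} $$

then $\langle y^4 \rangle_s = 3/4$.

Hence, we obtain

$$ \begin{bmatrix} \tfrac{1}{2} \\ \tfrac{3}{4} \end{bmatrix} = \alpha_1 u_1 + \alpha_2 u_2 \tag{6.3.46} $$

which have the solutions

$$ \alpha_1 = \tfrac{3}{8} \big( 1 + \sqrt{1 - 3\varepsilon^2} \big) \big/ \sqrt{1 - 3\varepsilon^2} , \qquad \alpha_2 = \tfrac{3}{8} \big( -1 + \sqrt{1 - 3\varepsilon^2} \big) \big/ \sqrt{1 - 3\varepsilon^2} . \tag{6.3.47} $$

The correlation function is, to 2nd order in $\varepsilon$ (many terms cancel),

$$ c_1(t) = \tfrac{1}{2} e^{\lambda_1 t} . \tag{6.3.48} $$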

Notice that the eigenvalues $\lambda_1$ and $\lambda_2$ depend on $\varepsilon^2$. Any attempt to solve the
system (6.3.40) perturbatively would involve expanding $\exp(\lambda_1 t)$ and $\exp(\lambda_2 t)$ in powers
of $\varepsilon^2$ and would yield terms like $t^N \exp(-2t)$ which would not be an accurate
representation of the long-time behaviour of the autocorrelation function.
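The breakdown of a naive expansion at long times can be seen directly. The sketch below is an illustration under stated assumptions: it takes the matrix of (6.3.40), checks that $-2 \pm \sqrt{1 - 3\varepsilon^2}$ are roots of its characteristic polynomial, and compares $\tfrac{1}{2} e^{\lambda_1 t}$ with a first-order expansion in $\varepsilon^2$ of that same exponential. The function names are hypothetical.

```python
import math


def eigenvalues(eps: float):
    """Eigenvalues of [[-1, -eps^2], [3, -3]], valid for 3 eps^2 <= 1."""
    root = math.sqrt(1.0 - 3.0 * eps**2)
    return -2.0 + root, -2.0 - root


def char_poly(lam: float, eps: float) -> float:
    """det([[-1 - lam, -eps^2], [3, -3 - lam]])."""
    return (-1.0 - lam) * (-3.0 - lam) + 3.0 * eps**2


def exact_decay(t: float, eps: float) -> float:
    """(1/2) exp(lambda_1 t), with the eps-dependence kept in the exponent."""
    return 0.5 * math.exp(eigenvalues(eps)[0] * t)


def first_order_decay(t: float, eps: float) -> float:
    """Same quantity with lambda_1 ~ -1 - (3/2) eps^2 and the exponential
    expanded to first order in eps^2, producing a secular term t * exp(-t)."""
    return 0.5 * math.exp(-t) * (1.0 - 1.5 * eps**2 * t)


if __name__ == "__main__":
    eps = 0.3
    for t in (0.5, 2.0, 5.0, 10.0):
        print(f"t={t:5.1f}  exact={exact_decay(t, eps):.3e}  "
              f"expanded={first_order_decay(t, eps):.3e}")
```

The expanded form tracks the exact decay for small $\varepsilon^2 t$ but changes sign once $\tfrac{3}{2}\varepsilon^2 t$ exceeds 1, whereas the exact correlation function stays positive; keeping $\varepsilon$ inside the eigenvalue avoids this.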

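The $x$ correlation function is

$$ \langle x(t) x(0) \rangle_s = \varepsilon^2 c_1(t) \tag{6.3.49} $$

and the spectrum

$$ S(\omega) = \varepsilon^2 \int_{-\infty}^{\infty} dt\, e^{-i\omega t}\, c_1(|t|) / 2\pi \tag{6.3.50} $$

$$ \phantom{S(\omega)} = \frac{\varepsilon^2}{2\pi} \frac{-\lambda_1}{\lambda_1^2 + \omega^2} . \tag{6.3.51} $$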


6.3.3 Asymptotic Method for Stationary Distributions 


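For an arbitrary Fokker-Planck equation

$$ \partial_t p = -\sum_i \partial_i [A_i(x) p] + \tfrac{1}{2} \varepsilon^2 \sum_{i,j} \partial_i \partial_j [B_{ij}(x) p] , \tag{6.3.52} $$

one can generate an asymptotic expansion for the stationary solution by setting

$$ p_s(x) = \exp[-\phi(x) / \varepsilon^2] \tag{6.3.53} $$

in terms of which we find

$$ \Big[ \sum_i A_i(x) \partial_i \phi + \tfrac{1}{2} \sum_{i,j} B_{ij}(x) \partial_i \phi\, \partial_j \phi \Big] - \varepsilon^2 \Big[ \sum_i \partial_i A_i(x) + \sum_{i,j} \partial_i B_{ij}(x)\, \partial_j \phi + \tfrac{1}{2} \sum_{i,j} B_{ij}(x) \partial_i \partial_j \phi \Big] + \tfrac{1}{2} \varepsilon^4 \sum_{i,j} \partial_i \partial_j B_{ij}(x) = 0 . \tag{6.3.54} $$

The first term, which is of order $\varepsilon^0$, is a Hamilton-Jacobi equation. The main significance
of the result is that an asymptotic expansion for $\phi(x)$ can be, in principle,
developed:

$$ \phi(x) = \sum_n \varepsilon^n \phi_n(x) \tag{6.3.55} $$

where $\phi_0(x)$ satisfies

$$ \sum_i A_i(x) \partial_i \phi_0 + \tfrac{1}{2} \sum_{i,j} B_{ij}(x) \partial_i \phi_0\, \partial_j \phi_0 = 0 . \tag{6.3.56} $$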


Graham and Tel [6.8, 9] have recently shown how equation (6.3.56) may be solved 
in the general case. Their main result is that solutions, though continuous, in 
general have infinitely many discontinuities in their derivatives, except in certain 
special cases, which are closely related to the situation in which the FPE satisfies 
potential conditions. 
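In one variable the lowest-order equation can be solved by quadrature, which gives a simple check on the method. The following worked example is a sketch added for illustration, using the drift and diffusion of the example in Sect. 6.3.2.

```latex
\begin{align*}
% One-variable form of the Hamilton-Jacobi equation (6.3.56)
A(x)\,\phi_0'(x) + \tfrac{1}{2} B(x)\,[\phi_0'(x)]^2 &= 0
  \quad\Longrightarrow\quad
  \phi_0'(x) = -\frac{2A(x)}{B(x)}
  \quad (\text{discarding the trivial root } \phi_0' = 0), \\
\phi_0(x) &= -2 \int^{x} \frac{A(x')}{B(x')}\, dx' . \\
\intertext{For $A(x) = -(x + x^3)$ and $B(x) = 1$ this gives}
\phi_0(x) &= x^2 + \tfrac{1}{2} x^4 ,
  \qquad
  p_s(x) \propto \exp\!\big[ -(x^2 + \tfrac{1}{2} x^4)/\varepsilon^2 \big] .
\end{align*}
```

In this special case (one variable, constant $B$) the lowest-order result already coincides with the exact zero-current stationary solution, so the higher $\phi_n$ contribute only to the normalisation.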
6.4 Adiabatic Elimination of Fast Variables


It is very often the case that a dynamical system can be described by stochastic equa- 
tions which have widely different response times and in which the behaviour on a 
very short time scale is not of interest. The most natural example of this occurs in 
Brownian motion. Here it is normal to observe only the position of the Brownian 
particle, but the fundamental equations involve the momentum as well, which is 
normally unobservable. Thus, Langevin's equation (1.2.14) can be rewritten as an
equation for position and velocity as

$$ \frac{dx}{dt} = v \tag{6.4.1} $$

$$ m \frac{dv}{dt} = -\beta v + \sqrt{2\beta kT}\, \xi(t) . \tag{6.4.2} $$


If we interpret the equations as Ito stochastic differential equations, the method of 
solution has already been given in Sect.4.4.6. However, it is simpler to integrate 
(6.4.2) first to give the solution 


$$ v(t) = v(0) \exp(-\beta t / m) + \frac{\sqrt{2\beta kT}}{m} \int_0^t \exp[-\beta (t - t') / m]\, \xi(t')\, dt' . \tag{6.4.3} $$


We now want to consider the situation in which the friction coefficient $\beta$ is not small
but the mass $m$ is very small. Then for times $t$ such that

$$ t \gg m / \beta \equiv \tau , \tag{6.4.4} $$

the exponential in the first term will be negligible and the lower limit in the integral
can be extended to $-\infty$ without significant error. Hence,

$$ v(t) \to \sqrt{\frac{2kT}{\beta}}\, \frac{1}{\tau} \int_{-\infty}^{t} \exp[-(t - t') / \tau]\, \xi(t')\, dt' . \tag{6.4.5} $$


Here $\tau$ will be called the relaxation time since it determines the time scale of relaxation
to (6.4.5).
Let us define

$$ \eta(t, \tau) = \tau^{-1} \int_{-\infty}^{t} \exp[-(t - t') / \tau]\, dW(t') \tag{6.4.6} $$

which is, from Sect. 4.4.4, a stationary Ornstein-Uhlenbeck process. The correlation
function is

$$ \langle \eta(t, \tau)\, \eta(t', \tau) \rangle = \frac{1}{2\tau} \exp(-|t - t'| / \tau) \tag{6.4.7} $$

$$ \phantom{\langle \eta(t, \tau)\, \eta(t', \tau) \rangle} \xrightarrow[\tau \to 0]{} \delta(t - t') . \tag{6.4.8} $$

We see that the limit $\tau \to 0$ corresponds to a white noise limit in which the correlation
function becomes a delta function.
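The approach of the kernel (6.4.7) to a delta function can be checked numerically. The following is a minimal sketch with hypothetical function names: it integrates the kernel over a small window $|t| < \delta$ by the midpoint rule and compares with the closed form $1 - e^{-\delta/\tau}$, which follows from elementary integration.

```python
import math


def kernel(t: float, tau: float) -> float:
    """Correlation function (1 / 2 tau) exp(-|t| / tau) of eta(t, tau)."""
    return math.exp(-abs(t) / tau) / (2.0 * tau)


def weight_within(delta: float, tau: float, steps: int = 20000) -> float:
    """Midpoint-rule integral of the kernel over |t| < delta.

    steps is even, so the kink at t = 0 falls on a cell boundary.
    """
    h = 2.0 * delta / steps
    return h * sum(kernel(-delta + (i + 0.5) * h, tau) for i in range(steps))


if __name__ == "__main__":
    delta = 0.1
    for tau in (0.1, 0.01, 0.001):
        numeric = weight_within(delta, tau)
        closed = 1.0 - math.exp(-delta / tau)
        print(f"tau={tau:6.3f}  weight in |t|<{delta}: {numeric:.6f}  (closed form {closed:.6f})")
```

The total area under the kernel is 1 for every $\tau$, while the fraction contained in any fixed window tends to 1 as $\tau \to 0$, which is the defining behaviour of a delta sequence.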
Thus, we can write (6.4.1) as

$$ \frac{dx}{dt} = \sqrt{\frac{2kT}{\beta}}\, \eta(t, \tau) \tag{6.4.9} $$

and in the limit $\tau \to 0$, this should become

$$ \frac{dx}{dt} = \sqrt{\frac{2kT}{\beta}}\, \xi(t) . \tag{6.4.10} $$

An alternative, and much more transparent way of looking at this is to say that in
(6.4.2) the limit $m \to 0$ corresponds to setting the left-hand side equal to zero, so
that

$$ v(t) = \sqrt{\frac{2kT}{\beta}}\, \xi(t) . \tag{6.4.11} $$


The reasoning here is very suggestive but completely nonrigorous and gives no idea
of any systematic approximation method, which should presumably be some
asymptotic expansion in a small dimensionless parameter. Furthermore, there
does not seem to be any way of implementing such an expansion directly on the
stochastic differential equation; at least to the author's knowledge no one has
successfully developed such a scheme.

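The Fokker-Planck equation equivalent to (6.4.1, 2) for the distribution function
$p(x, v, t)$ is

$$ \frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}(v p) + \frac{\beta}{m} \Big[ \frac{\partial}{\partial v}(v p) + \frac{kT}{m} \frac{\partial^2 p}{\partial v^2} \Big] . \tag{6.4.12} $$

We define the position distribution function $\hat p(x, t)$ by

$$ \hat p(x, t) = \int_{-\infty}^{\infty} dv\, p(x, v, t) . \tag{6.4.13} $$

Then we expect that, corresponding to the "reduced" Langevin equation (6.4.10),
the FPE for $\hat p(x, t)$ is

$$ \frac{\partial \hat p}{\partial t} = \frac{kT}{\beta} \frac{\partial^2 \hat p}{\partial x^2} . \tag{6.4.14} $$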


We seek a way of deriving (6.4.14) from (6.4.12) in some perturbative manner, so 
that we obtain higher corrections in powers of the appropriate small parameter. 

More generally, we can consider Brownian motion in a potential for which the
Langevin equations are (Sect. 5.3.6)

$$ \frac{dx}{dt} = v , \qquad m \frac{dv}{dt} = -\beta v - V'(x) + \sqrt{2\beta kT}\, \xi(t) . \tag{6.4.15} $$


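The limit of large $\beta$ should result in very rapid relaxation of the second equation
to a quasistationary state in which $dv/dt \to 0$. Hence, we assume that for large
enough $\beta$,

$$ v = -\frac{1}{\beta} V'(x) + \sqrt{\frac{2kT}{\beta}}\, \xi(t) \tag{6.4.16} $$

and substituting in (6.4.15) we get

$$ \frac{dx}{dt} = -\beta^{-1} V'(x) + \sqrt{\frac{2kT}{\beta}}\, \xi(t) \tag{6.4.17} $$

corresponding to a FPE for $\hat p(x)$ known as the Smoluchowski equation:

$$ \frac{\partial \hat p}{\partial t} = \beta^{-1} \frac{\partial}{\partial x} \Big[ V'(x)\, \hat p + kT \frac{\partial \hat p}{\partial x} \Big] . \tag{6.4.18} $$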


In this case we have eliminated the fast variable v, which is assumed to relax 
very rapidly to the value given by (6.4.16). 
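The separation of time scales behind this elimination can be made explicit in a solvable case. The sketch below is an illustration under stated assumptions, not a calculation from the text: it takes a harmonic potential $V(x) = \tfrac{1}{2}\kappa x^2$ (the stiffness `kappa` is a name introduced here), for which the mean values obey $m\ddot{x} + \beta\dot{x} + \kappa x = 0$, and compares the two decay rates of that equation with the single rate $-\kappa/\beta$ that the reduced equation (6.4.17) gives for the mean.

```python
import math


def relaxation_rates(m: float, beta: float, kappa: float):
    """Roots of m*lam^2 + beta*lam + kappa = 0 in the overdamped regime.

    Returns (slow, fast). The slow root is written in a cancellation-free form.
    """
    disc = beta * beta - 4.0 * m * kappa
    if disc < 0.0:
        raise ValueError("underdamped: no separation into slow and fast decay")
    root = math.sqrt(disc)
    slow = -2.0 * kappa / (beta + root)   # equals (-beta + root) / (2 m)
    fast = -(beta + root) / (2.0 * m)
    return slow, fast


def smoluchowski_rate(beta: float, kappa: float) -> float:
    """Decay rate of the mean position from dx/dt = -V'(x)/beta."""
    return -kappa / beta


if __name__ == "__main__":
    beta, kappa = 1.0, 1.0
    for m in (1e-1, 1e-2, 1e-4):
        slow, fast = relaxation_rates(m, beta, kappa)
        print(f"m={m:7.0e}  slow={slow:.6f}  fast={fast:.1f}  "
              f"reduced={smoluchowski_rate(beta, kappa):.6f}")
```

As $m/\beta$ decreases the fast rate runs off to $-\beta/m$ while the slow rate converges to the reduced value, which is the sense in which $v$ is slaved to $x$.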

This procedure is the prototype of all adiabatic elimination procedures which
have been used as the basis of Haken's slaving principle [6.1]. The basic physical
assumption is that large $\beta$ (or, in general, short relaxation times) forces the variables
governed by equations involving large $\beta$ (e.g., $v$) to relax to a value given by assuming
the slow variable (in this case $x$) to be constant. Such fast variables are then
effectively slaved by the slow variables.

Surprisingly, the problem of a rigorous derivation of the Smoluchowski equa- 
tion and an estimation of corrections to it, has only rather recently been solved. 
The first treatment was by Brinkman [6.2] who only estimated the order of magni- 
tude of corrections to (6.4.18) but did not give all the correction terms to lowest 
order. The first correct solution was by Stratonovich [Ref. 6.3, Chap. 4, Sect. 11.1]. 
Independently, Wilemski [6.4] and Titulaer [6.5] have also given correct treatments. 

In the following sections we will present a systematic and reasonably general 
theory of the problem of the derivation of the Smoluchowski equation and correc- 
tions to it, and will then proceed to more general adiabatic elimination problems. 
The procedure used is an adaptation of projection operator methods, which have
been used in statistical physics, quantum optics, and related fields for many years.
These methods can be formulated directly in the time domain, but we will find it 
more convenient to use a Laplace transform method, which was that originally used 
by Wilemski. The manner of presentation is similar to that of Papanicolaou [6.6], 
who has given a rigorous basis to its use in some problems. However, the de- 
monstrations used here will be largely formal in character. 
6.4.1 Abstract Formulation in Terms of Operators and Projectors


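Let us consider the rescaled form of the Fokker-Planck equation (6.4.12) derived
as (5.3.97) in Sect. 5.3.6, which we can write in the form

$$ \frac{\partial p}{\partial t} = (\gamma L_1 + L_2)\, p , \tag{6.4.19} $$

where $L_1$ and $L_2$ are differential operators given by

$$ L_1 = \frac{\partial}{\partial u} \Big( u + \frac{\partial}{\partial u} \Big) \tag{6.4.20} $$

$$ L_2 = -u \frac{\partial}{\partial y} + U'(y) \frac{\partial}{\partial u} . \tag{6.4.21} $$

We would like to derive an equation for the distribution function in $y$,

$$ \hat p(y, t) = \int du\, p(u, y, t) \tag{6.4.22} $$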


which would be valid in the limit where $\gamma$ becomes very large.
It is expected that an approximate solution to (6.4.19) would be obtained by
multiplying $\hat p(y, t)$ by the stationary distribution of

$$ \frac{\partial p}{\partial t} = L_1 p , \tag{6.4.23} $$

that is, by

$$ (2\pi)^{-1/2} \exp(-\tfrac{1}{2} u^2) . \tag{6.4.24} $$


The reasoning is that for large $\gamma$, the velocity distribution is very rapidly thermalised
or, more crudely, we can in (6.4.19) neglect $L_2$ compared to $L_1$ and the solution
is a function of $y$ multiplied by a solution of (6.4.23), which approaches a
stationary solution in a time of order $\gamma^{-1}$, which will be very small.

We formalise this by defining a projection operator $P$ by

$$ (Pf)(u, y) = (2\pi)^{-1/2} \exp(-\tfrac{1}{2} u^2) \int du\, f(u, y) , \tag{6.4.25} $$

where $f(u, y)$ is an arbitrary function. The reader may easily check that

$$ P^2 = P . \tag{6.4.26} $$
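The projector property can be verified on a grid. The following is a minimal numerical sketch, assuming a uniform grid in $u$ on $[-10, 10]$ at one fixed $y$ and trapezoidal quadrature; the test function and all names are illustrative choices, not taken from the text.

```python
import math

U_MAX = 10.0   # half-width of the velocity grid (assumed large enough)
N = 2000       # number of grid intervals
H = 2.0 * U_MAX / N
GRID = [-U_MAX + i * H for i in range(N + 1)]

# Stationary velocity distribution (6.4.24) sampled on the grid
P0 = [math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi) for u in GRID]


def trapezoid(values):
    """Trapezoidal-rule integral over the velocity grid."""
    return H * (sum(values) - 0.5 * (values[0] + values[-1]))


def project(f_values):
    """Discrete version of (6.4.25): replace f by P0(u) times its u-integral."""
    total = trapezoid(f_values)
    return [p0 * total for p0 in P0]


# An arbitrary, non-thermal velocity profile at some fixed y
f = [(1.0 + u) * math.exp(-((u - 1.0) ** 2)) for u in GRID]
Pf = project(f)
PPf = project(Pf)
max_diff = max(abs(a - b) for a, b in zip(Pf, PPf))

if __name__ == "__main__":
    print("integral of f      :", trapezoid(f))
    print("integral of Pf     :", trapezoid(Pf))
    print("max |P(Pf) - Pf|   :", max_diff)
```

Applying the projection twice changes nothing because the Gaussian factor integrates to 1; the same normalisation is what makes $P$ preserve the $u$-integral of $f$, and hence the marginal distribution in $y$.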


In terms of the vector space of all functions of $u$ and $y$, $P$ is an operator which projects
any vector into the subspace of all vectors which can be written in the form of

$$ g(u, y) = (2\pi)^{-1/2} \exp(-\tfrac{1}{2} u^2)\, \hat g(y) , \tag{6.4.27} $$

where $\hat g(y)$ is an arbitrary function of $y$. However, functions of the form (6.4.27)
are all solutions of

$$ L_1 g = 0 , \tag{6.4.28} $$

that is, the space into which $P$ projects is the null space of $L_1$. We may also note that
in this case

$$ P = \lim_{t \to \infty} [\exp(L_1 t)] . \tag{6.4.29} $$


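To prove this, expand any function of $u$ and $y$ in eigenfunctions $P_n(u)$ of $L_1$ as in
Sect. 5.2.5:

$$ f(u, y) = \sum_n A_n(y) P_n(u) , \tag{6.4.30} $$

where

$$ A_n(y) = \int du\, Q_n(u) f(u, y) . \tag{6.4.31} $$

Then

$$ \lim_{t \to \infty} [\exp(L_1 t) f(u, y)] = \sum_n A_n(y) \lim_{t \to \infty} e^{-\lambda_n t} P_n(u) \tag{6.4.32} $$

$$ \phantom{\lim_{t \to \infty} [\exp(L_1 t) f(u, y)]} = P_0(u) \int du\, Q_0(u) f(u, y) \tag{6.4.33} $$

and noting that for this process (Ornstein-Uhlenbeck)

$$ P_0(u) = (2\pi)^{-1/2} \exp(-\tfrac{1}{2} u^2) \tag{6.4.34} $$

$$ Q_0(u) = 1 , \tag{6.4.35} $$

we see that (6.4.29) follows.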
In this case and in all other cases, we also have the essential relation

$$ P L_2 P = 0 . \tag{6.4.36} $$

For, considering $P L_2 P f(u, y)$, we see from the definition of $L_2$ that $L_2 P f(u, y)$ is
proportional to

$$ u \exp(-\tfrac{1}{2} u^2) \propto P_1(u) \tag{6.4.37} $$

and

$$ P\, P_1(u) = 0 \tag{6.4.38} $$

either by explicit substitution or by noting that $P_1(u)$ is not in the null space of $L_1$.
Let us define

$$ v = P p \tag{6.4.39} $$

$$ w = (1 - P)\, p \tag{6.4.40} $$

so that

$$ p = v + w $$

and $v$ is in the null space of $L_1$, while $w$ is not in the null space of $L_1$.
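We can now note that, from (6.4.29),

$$ P L_1 = L_1 P = 0 \tag{6.4.41} $$

and from the original equation we have

$$ \frac{\partial v}{\partial t} = P (\gamma L_1 + L_2)\, p = P (\gamma L_1 + L_2) [P p + (1 - P) p] = P L_2 (1 - P)\, p , $$

[where we have used (6.4.41) and (6.4.36)] so that

$$ \frac{\partial v}{\partial t} = P L_2 w . \tag{6.4.42} $$

Similarly,

$$ \frac{\partial w}{\partial t} = (1 - P)(\gamma L_1 + L_2)\, p = (1 - P)(\gamma L_1 + L_2)[P p + (1 - P) p] = \gamma L_1 (1 - P)\, p + (1 - P) L_2 (1 - P)\, p + (1 - P) L_2 P p \tag{6.4.43} $$

and using $P L_2 P = 0$, we have

$$ \frac{\partial w}{\partial t} = \gamma L_1 w + (1 - P) L_2 w + L_2 v . \tag{6.4.44} $$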


6.4.2 Solution Using Laplace Transform 


The fundamental equations (6.4.42, 44) can be solved in a number of iterative ways. 
However, since they are linear equations, a solution in terms of the Laplace trans- 
form can be very appropriate and readily yields a perturbation expansion. 
The Laplace transform of any function of time $f(t)$ is defined by

$$ \tilde f(s) = \int_0^{\infty} e^{-st} f(t)\, dt \tag{6.4.45} $$

and it may quite readily be defined for operators and abstract vectors in exactly the
same way. Our fundamental equations become, on using

$$ \int_0^{\infty} e^{-st} \frac{df}{dt}\, dt = s \tilde f(s) - f(0) , \tag{6.4.46} $$

$$ \begin{aligned} s\, \tilde v(s) &= P L_2 \tilde w(s) + v(0) , \\ s\, \tilde w(s) &= [\gamma L_1 + (1 - P) L_2]\, \tilde w(s) + L_2 \tilde v(s) + w(0) . \end{aligned} \tag{6.4.47} $$


These equations are linear operator equations, whose formal solution is straightforward.
For simplicity, let us first assume

$$ w(0) = 0 \tag{6.4.48} $$

which means that the initial distribution is assumed to be of the form

$$ p(u, y, 0) = (2\pi)^{-1/2} \exp(-\tfrac{1}{2} u^2)\, \hat p(y, 0) , \tag{6.4.49} $$

that is, the initial thermalisation of the velocity $u$ is assumed. Then we have formally

$$ \tilde w(s) = [s - \gamma L_1 - (1 - P) L_2]^{-1} L_2 \tilde v(s) \tag{6.4.50} $$

and hence,

$$ s\, \tilde v(s) = P L_2 [s - \gamma L_1 - (1 - P) L_2]^{-1} L_2 \tilde v(s) + v(0) . \tag{6.4.51} $$

We have here, at least formally, the complete solution of the problem. For any
finite $s$, we can take the large $\gamma$ limit to find

$$ s\, \tilde v(s) = -\gamma^{-1} P L_2 L_1^{-1} L_2 \tilde v(s) + v(0) . \tag{6.4.52} $$

Notice that $L_1^{-1}$ does not always exist. However, we know that

$$ P L_2 \tilde v(s) = P L_2 P\, \tilde p(s) = 0 \tag{6.4.53} $$

from (6.4.36). Hence, $L_2 \tilde v(s)$ contains no component in the null space of $L_1$ and thus
$L_1^{-1} L_2 \tilde v(s)$ exists.


In this case of Kramers' equation, let us now see what (6.4.52) looks like. It is
equivalent to the differential equation

$$ \frac{\partial v}{\partial t} = -\gamma^{-1} P L_2 L_1^{-1} L_2 v . \tag{6.4.54} $$

Now note that

$$ L_2 v = \Big[ -u \frac{\partial}{\partial y} + U'(y) \frac{\partial}{\partial u} \Big] P_0(u) \int du'\, p(u', y, t) \tag{6.4.55} $$

[with $P_0(u)$ defined by (6.4.34)].


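For this problem it is useful to bring in the eigenfunctions of the Ornstein-Uhlenbeck
process [Sect. 5.2.6c] which for these parameters,

$$ \tfrac{1}{2} D = k = 1 \tag{6.4.56} $$

take the form

$$ P_n(u) = (2\pi)^{-1/2} \exp(-\tfrac{1}{2} u^2)\, Q_n(u) \tag{6.4.57} $$

with

$$ Q_n(u) = (2^n n!)^{-1/2} H_n(u / \sqrt{2}) . \tag{6.4.58} $$

Using

$$ L_1 P_n(u) = -n P_n(u) \tag{6.4.59} $$

and the recursion formulae for Hermite polynomials

$$ x H_n(x) = \tfrac{1}{2} H_{n+1}(x) + n H_{n-1}(x) \tag{6.4.60} $$

$$ \frac{d}{dx} \big[ e^{-x^2} H_n(x) \big] = -e^{-x^2} H_{n+1}(x) , \tag{6.4.61} $$

we see that

$$ L_2 v = -\Big[ U'(y) + \frac{\partial}{\partial y} \Big] P_1(u)\, \hat p(y) \tag{6.4.62} $$

so that, using (6.4.59),

$$ L_1^{-1} L_2 v = \Big[ U'(y) + \frac{\partial}{\partial y} \Big] P_1(u)\, \hat p(y) . \tag{6.4.63} $$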

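We now apply $L_2$ once more and use the relations (6.4.60, 61) again.
We find

$$ L_2 P_1(u) = -\big[ \sqrt{2}\, P_2(u) + P_0(u) \big] \frac{\partial}{\partial y} - \sqrt{2}\, P_2(u)\, U'(y) \tag{6.4.64} $$

$$ P L_2 L_1^{-1} L_2 v = -\frac{\partial}{\partial y} \Big[ U'(y) + \frac{\partial}{\partial y} \Big] \hat p(y)\, P_0(u) \tag{6.4.65} $$

and the equation of motion (6.4.54) takes the form [after cancelling the factor $P_0(u)$]

$$ \frac{\partial \hat p}{\partial t} = \gamma^{-1} \frac{\partial}{\partial y} \Big[ U'(y)\, \hat p + \frac{\partial \hat p}{\partial y} \Big] \tag{6.4.66} $$

which is exactly the Smoluchowski equation, derived using the naive elimination
given in Sect. 6.4.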
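The operator algebra leading to (6.4.65) can be checked with small matrices. The sketch below is an illustration under stated assumptions: it represents multiplication by $u$ and the derivative $\partial/\partial u$ in a truncated basis of the eigenfunctions $P_n(u)$, using the relations $u P_n = \sqrt{n+1}\, P_{n+1} + \sqrt{n}\, P_{n-1}$ and $\partial_u P_n = -\sqrt{n+1}\, P_{n+1}$, which were obtained here from (6.4.57)-(6.4.61) and are worth rechecking. All names are hypothetical.

```python
import math

N = 6  # truncation of the basis P_0 ... P_{N-1}; only n <= 2 actually enters


def zeros():
    return [[0.0] * N for _ in range(N)]


U = zeros()        # U[m][n]: coefficient of P_m in u * P_n
D = zeros()        # D[m][n]: coefficient of P_m in d/du P_n
L1_PINV = zeros()  # inverse of L1 = diag(0, -1, -2, ...) off its null space

for n in range(N):
    if n + 1 < N:
        U[n + 1][n] = math.sqrt(n + 1)
        D[n + 1][n] = -math.sqrt(n + 1)
    if n >= 1:
        U[n - 1][n] = math.sqrt(n)
        L1_PINV[n][n] = -1.0 / n


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def element00(a, b, c):
    """(0, 0) matrix element of a b c, i.e. the P_0 -> P_0 component."""
    return matmul(matmul(a, b), c)[0][0]


uu = element00(U, L1_PINV, U)   # P u L1^{-1} u P
ud = element00(U, L1_PINV, D)   # P u L1^{-1} d/du P
du = element00(D, L1_PINV, U)   # P d/du L1^{-1} u P
dd = element00(D, L1_PINV, D)   # P d/du L1^{-1} d/du P

if __name__ == "__main__":
    print("uu, ud, du, dd =", uu, ud, du, dd)
    print("P L2 P components:", U[0][0], D[0][0])
```

With $L_2 = -u\,\partial_y + U'(y)\,\partial_u$, these four numbers combine to $P L_2 L_1^{-1} L_2 P = \mathit{uu}\,\partial_y^2 - \mathit{ud}\,\partial_y U'(y)$, the other two terms dropping out, which reproduces $-\partial_y [U'(y) + \partial_y]$ as in (6.4.65). The vanishing of `U[0][0]` and `D[0][0]` is the matrix form of $P L_2 P = 0$.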


6.4.3. Short-Time Behaviour 


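One should note that the limit $\gamma \to \infty$ in (6.4.52) implies that $s$ is finite, or roughly
speaking

$$ \gamma \gg s . \tag{6.4.67} $$

Thus, the solution for the Laplace transform will only be valid provided

$$ s \ll \gamma \tag{6.4.68} $$

or for the solution, this will mean that

$$ t \gg \gamma^{-1} . \tag{6.4.69} $$

Let us define

$$ s_1 = s \gamma^{-1} \tag{6.4.70} $$

so that (6.4.51) becomes

$$ \gamma s_1 \tilde v = P L_2 [s_1 \gamma - L_1 \gamma - (1 - P) L_2]^{-1} L_2 \tilde v + v(0) . \tag{6.4.71} $$

The limit $\gamma \to \infty$ gives

$$ \gamma s_1 \tilde v = \gamma^{-1} P L_2 (s_1 - L_1)^{-1} L_2 \tilde v + v(0) . \tag{6.4.72} $$

Using the fact that $L_2 \tilde v$ is proportional to $P_1(u)$ (6.4.62), we see that

$$ \gamma s_1 \tilde v = \gamma^{-1} (s_1 + 1)^{-1} P L_2^2 \tilde v + v(0) . \tag{6.4.73} $$

Changing back to the variable $s$ again and rearranging, we find

$$ s \tilde v = \gamma^{-1} \Big( \frac{s}{\gamma} + 1 \Big)^{-1} P L_2^2 \tilde v + v(0) \tag{6.4.74} $$

which is equivalent to

$$ \frac{\partial v}{\partial t} = \int_0^t dt'\, \exp[\gamma (t' - t)]\, P L_2^2\, v(t') . \tag{6.4.75} $$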
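Alternatively, we can rewrite (6.4.74) in the form

$$ \frac{1}{\gamma} [s^2 \tilde v - s v(0)] + [s \tilde v - v(0)] = \gamma^{-1} P L_2^2 \tilde v \tag{6.4.76} $$

which, using (6.4.46) and the same working out as in Sect. 6.4.2, is equivalent to the
equation for $\hat p$:

$$ \frac{1}{\gamma} \frac{\partial^2 \hat p}{\partial t^2} + \frac{\partial \hat p}{\partial t} = \gamma^{-1} \frac{\partial}{\partial y} \Big[ U'(y)\, \hat p + \frac{\partial \hat p}{\partial y} \Big] \tag{6.4.77} $$

in which the initial condition

$$ \frac{\partial \hat p}{\partial t}(y, 0) = 0 $$

is implied because

$$ \int_0^{\infty} e^{-st} f''(t)\, dt = s^2 \tilde f(s) - s f(0) - f'(0) $$

and no constant term appears in the first bracket in (6.4.76). We may similarly
rewrite (6.4.75) or integrate (6.4.77) to get

$$ \frac{\partial \hat p}{\partial t} = \frac{\partial}{\partial y} \Big[ U'(y) + \frac{\partial}{\partial y} \Big] \int_0^t dt'\, \exp[\gamma (t' - t)]\, \hat p(t') . \tag{6.4.78} $$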


Equations (6.4.77, 78) demonstrate a non-Markov nature, seen explicitly in (6.4.78)
which indicates that the prediction of $\hat p(t + \Delta t)$ requires the knowledge of $\hat p(t')$ for
$0 \leq t' \leq t$. However, the kernel $\exp[\gamma (t' - t)]$ is significantly different from zero
only for $|t' - t| \sim \gamma^{-1}$ and on a time scale much longer than this, (6.4.78) is
approximated by the Smoluchowski equation (6.4.66). Formally, we achieve this by
integrating by parts in (6.4.78):

$$ \int_0^t dt'\, \exp[\gamma (t' - t)]\, \hat p(t') = \frac{\hat p(t) - e^{-\gamma t} \hat p(0)}{\gamma} - \gamma^{-1} \int_0^t \exp[\gamma (t' - t)]\, \frac{\partial \hat p}{\partial t'}\, dt' . \tag{6.4.79} $$

Neglecting the last term as being of order $\gamma^{-2}$, we find the Smoluchowski equation
replaced by

$$ \frac{\partial \hat p}{\partial t} = \gamma^{-1} \frac{\partial}{\partial y} \Big[ U'(y) + \frac{\partial}{\partial y} \Big] \big[ \hat p(t) - e^{-\gamma t} \hat p(0) \big] . \tag{6.4.80} $$

This equation is to lowest order in $\gamma$ equivalent to (6.4.78) for all times, that is, very
short ($< \gamma^{-1}$) and very long ($> \gamma^{-1}$) times. It shows the characteristic "memory
time", $\gamma^{-1}$, which elapses before the equation approaches the Smoluchowski form.
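The role of the memory time is easiest to see for a free particle. The following worked example is a sketch added for illustration: it assumes $U(y) = 0$, takes (6.4.77) in the form $\gamma^{-1}\partial_t^2 \hat p + \partial_t \hat p = \gamma^{-1}\partial_y[U'(y)\hat p + \partial_y \hat p]$ with $\partial_t \hat p = 0$ at $t = 0$, and writes $m(t) = \langle y^2 \rangle_t - \langle y^2 \rangle_0$ (a symbol introduced here).

```latex
\begin{align*}
% Multiply the U = 0 equation by y^2 and integrate by parts twice
\frac{1}{\gamma}\, \ddot m + \dot m &= \frac{2}{\gamma} ,
   \qquad m(0) = 0 , \quad \dot m(0) = 0 , \\
\dot m(t) &= \frac{2}{\gamma} \big( 1 - e^{-\gamma t} \big) , \\
m(t) &= \frac{2}{\gamma} \Big[ t - \frac{1 - e^{-\gamma t}}{\gamma} \Big] . \\
\intertext{The limiting forms are}
m(t) &\simeq t^2 \quad (t \ll \gamma^{-1}) ,
 \qquad
m(t) \simeq \frac{2t}{\gamma} \quad (t \gg \gamma^{-1}) .
\end{align*}
```

The short-time form is ballistic motion with the thermal velocity variance $\langle u^2 \rangle = 1$, and the long-time form is the diffusive growth given by the Smoluchowski equation (6.4.66) alone; the crossover occurs at $t \sim \gamma^{-1}$. The same $\dot m(t)$ is obtained from (6.4.80) with $U = 0$.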
To this order then, the process can be approximated by a Markov process, but
to the same order, there are the alternative expressions (6.4.77, 78) which are not
Markov processes. Clearly, to any higher order, we must have a non-Markov process.


6.4.4 Boundary Conditions 


Let us consider a barrier to be erected at $y = a$ with the particle confined to $y \leq a$.
The behaviour for $u > 0$ and $u < 0$ is distinct.
From the stochastic differential equations

$$ \begin{aligned} dy &= u\, dt \\ du &= -[U'(y) + \gamma u]\, dt + \sqrt{2\gamma}\, dW(t) \end{aligned} \tag{6.4.81} $$

we see that

for $u > 0$, $y = a$ is an exit boundary

for $u < 0$, $y = a$ is an entrance boundary,

since a particle with $u > 0$ at $y = a$ must proceed to $y > a$ or be absorbed. Similarly,
particles to the left of the boundary can never reach $y = a$ if $u < 0$. Conventionally,
we describe $y = a$ as absorbing or reflecting as follows:

i) absorbing barrier: particle absorbed for $u > 0$, no particles with $u < 0$

$$ \Rightarrow\ p(u, a, t) = \begin{cases} 0 & u > 0 \\ 0 & u < 0 . \end{cases} \tag{6.4.82} $$

The first condition is the usual absorbing boundary condition, as derived in Sect.
5.2.1. The second condition expresses the fact that any particle placed with $u < 0$
at $y = a$ immediately proceeds to $y < a$, and no further particles are introduced.
The absorbing barrier condition clearly implies

$$ \hat p(a, t) = 0 \tag{6.4.83} $$

which is the usual absorbing barrier condition in a one variable Fokker-Planck
equation.

ii) "reflecting" barrier: physically, a reflection at $y = a$ implies that the particle
reaches $y = a$ with the velocity $u$, and is immediately reflected with a different
but negative velocity. If we assume that

$$ u \to -u , $$

then we have a "periodic boundary condition" as in Sects. 5.2.1 and 5.3.2. This
means that

$$ p(u, a, t) = p(-u, a, t) \tag{6.4.84} $$

and that the normal component of the current leaving at $(u, a)$ is equal to that
entering at $(-u, a)$. However, since for this equation

$$ J = \Big( -u p ,\ [\gamma u + U'(y)]\, p + \gamma \frac{\partial p}{\partial u} \Big) \tag{6.4.85} $$

and the normal component is the component along the direction $y$, we see that
(6.4.84, 85) are equivalent.

This boundary condition leads very naturally to that for the Smoluchowski
equation.

For (6.4.84) implies that only even-order eigenfunctions $P_n(u)$ can occur in
the expansion of $p(u, a, t)$. Hence,

$$(1 - P)\,p(u, a, t) = w(u, a, t) \tag{6.4.86}$$

contains only even eigenfunctions, and the same is true of $\tilde w(u, a, s)$, the Laplace
transform. But, from (6.4.50) we see that to lowest order in $\gamma^{-1}$

$$\tilde w(u, a, s) = (-\gamma^{-1}L_1^{-1}L_2\tilde v)(u, a, s) \tag{6.4.87}$$

and using (6.4.63)

$$= -\gamma^{-1}P_1(u)\left[U'(y)\,\tilde p(y) + \frac{\partial \tilde p(y)}{\partial y}\right]_{y=a}. \tag{6.4.88}$$

Since this is proportional to the odd eigenfunction $P_1(u)$, it vanishes. Hence we
derive

$$\left[U'(y)\,\hat p(y) + \frac{\partial \hat p(y)}{\partial y}\right]_{y=a} = 0 \tag{6.4.89}$$

which is the correct reflecting barrier boundary condition for the Smoluchowski
equation.

It is not difficult to show similarly that the same boundary conditions can be
derived for the equations derived in Sect. 6.4.3.
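
As a quick consistency check on the reflecting condition (6.4.89), one may note that the stationary solution of the Smoluchowski equation satisfies it automatically. The following is an added remark rather than part of the derivation above; it assumes the stationary solution is normalisable, and the symbol $\mathcal N$ for its normalisation constant is introduced here only for illustration.

```latex
\hat p_s(y) = \mathcal N e^{-U(y)}
\quad\Longrightarrow\quad
U'(y)\,\hat p_s(y) + \frac{\partial \hat p_s(y)}{\partial y}
  = U'(y)\,\mathcal N e^{-U(y)} - U'(y)\,\mathcal N e^{-U(y)} = 0 .
```

The bracket vanishes at every $y$, and therefore in particular at the barrier $y = a$, so a reflecting barrier does not disturb the equilibrium distribution.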


6.4.5 Systematic Perturbative Analysis 


We return now to (6.4.47), but again with $w(0) = 0$ for simplicity.
Then we have

$$\tilde w(s) = [s - \gamma L_1 - (1 - P)L_2]^{-1}L_2\tilde v(s) \tag{6.4.90}$$

and

$$s\,\tilde v(s) = PL_2[s - \gamma L_1 - (1 - P)L_2]^{-1}L_2\tilde v(s) + v(0). \tag{6.4.91}$$

We can straightforwardly expand the inverse in (6.4.91) in powers of $\gamma$. However,
the order of magnitude of $s$ in this expansion must be decided. From the previous
sections we see that there is a possibility of defining

$$s_1 = s\gamma^{-1}, \qquad \tilde v_1 = \gamma\tilde v \tag{6.4.92}$$

which gives rise to an expansion in which the initial time behaviour in times of
order $\gamma^{-1}$ is explicitly exhibited.
An alternative is to define

$$s_2 = s\gamma, \qquad \tilde v_2 = \gamma^{-1}\tilde v \tag{6.4.93}$$

which we will find yields an expansion in which only the long-time behaviour is
exhibited. (In what follows the subscript on the rescaled $\tilde v$ is dropped.)

By substituting (6.4.93) we find that (6.4.91) takes the form

$$s_2\tilde v = -PL_2[L_1 + (1 - P)L_2\gamma^{-1} - s_2\gamma^{-2}]^{-1}L_2\tilde v + v(0) \tag{6.4.94}$$

and we may now write

$$s_2\tilde v = -PL_2L_1^{-1}L_2\tilde v + v(0) \qquad (\gamma \to \infty) \tag{6.4.95}$$

whereas, without this substitution, the limit $\gamma \to \infty$ yields simply

$$s\tilde v = v(0). \tag{6.4.96}$$

This is, in fact, in agreement with (6.4.52), where we did not actually go to the
limit $\gamma \to \infty$.
The substitution of $s_1 = s\gamma^{-1}$,

$$s_1\tilde v = \gamma^{-2}PL_2[s_1 - L_1 - (1 - P)L_2\gamma^{-1}]^{-1}L_2\tilde v(s) + v(0) \tag{6.4.97}$$

does not lead to a limit as $\gamma \to \infty$. However, it does yield an expansion which is
similar to an ordinary perturbation problem, and, as we have seen, exhibits
short-time behaviour.


a) Long-Time Perturbation Theory
We can expand (6.4.94) to order $\gamma^{-2}$ in the form

$$s_2\tilde v = [A + B\gamma^{-1} + (C + Ds_2)\gamma^{-2}]\tilde v + v(0) \tag{6.4.98}$$

with

$$A = -PL_2L_1^{-1}L_2$$
$$B = PL_2L_1^{-1}(1 - P)L_2L_1^{-1}L_2$$
$$C = -PL_2L_1^{-1}(1 - P)L_2L_1^{-1}(1 - P)L_2L_1^{-1}L_2$$
$$D = -PL_2L_1^{-2}L_2. \tag{6.4.99}$$

Rearranging (6.4.98) we have

$$s_2(1 - \gamma^{-2}D)\tilde v = [A + B\gamma^{-1} + C\gamma^{-2}]\tilde v + v(0) \tag{6.4.100}$$

or, to order $\gamma^{-2}$,

$$s_2\tilde v = [A + B\gamma^{-1} + (C + DA)\gamma^{-2}]\tilde v + (1 + \gamma^{-2}D)v(0). \tag{6.4.101}$$

This gives the Laplace transform of an equation of motion for $v$, in which the initial
condition is not $v(0)$, but $(1 + \gamma^{-2}D)v(0)$. Notice that this equation will be of first
order in time, since $s_2^n$ for $n > 1$ does not occur. This is not possible, of course, for
an expansion to higher powers of $\gamma$.
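
The step from (6.4.100) to (6.4.101) can be spelled out as follows. This is only a sketch of the intermediate algebra, assuming that $(1 - \gamma^{-2}D)$ may be inverted as a power series in $\gamma^{-2}$ and that terms of order $\gamma^{-3}$ and higher are discarded.

```latex
\begin{aligned}
(1 - \gamma^{-2}D)^{-1} &= 1 + \gamma^{-2}D + O(\gamma^{-4}), \\
s_2\tilde v &= (1 + \gamma^{-2}D)\,[A + B\gamma^{-1} + C\gamma^{-2}]\,\tilde v
              + (1 + \gamma^{-2}D)\,v(0) + O(\gamma^{-3}) \\
            &= [A + B\gamma^{-1} + (C + DA)\gamma^{-2}]\,\tilde v
              + (1 + \gamma^{-2}D)\,v(0) + O(\gamma^{-3}).
\end{aligned}
```

The operator ordering matters: $D$ multiplies $A$ from the left, which is why the combination $DA$, and not $AD$, appears in (6.4.101).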


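b) Application to Brownian Motion
For Brownian motion we have already computed the operator $A$ of (6.4.99); it is
given by (6.4.65)

$$-PL_2L_1^{-1}L_2 = \frac{\partial}{\partial y}\left[U'(y) + \frac{\partial}{\partial y}\right]. \tag{6.4.102}$$

The other operators can be similarly calculated.
For example, from (6.4.63, 64),

$$L_2L_1^{-1}L_2 = -\left\{P_0(u)\frac{\partial}{\partial y} + \sqrt{2}\,P_2(u)\left[U'(y) + \frac{\partial}{\partial y}\right]\right\}\left[U'(y) + \frac{\partial}{\partial y}\right].$$

Multiplying by $(1 - P)$ simply removes the term involving $P_0$, and multiplying by
$L_1^{-1}$ afterwards multiplies by $-\tfrac12$. Hence,

$$L_1^{-1}(1 - P)L_2L_1^{-1}L_2 = \frac{\sqrt2}{2}\,P_2(u)\left[U'(y) + \frac{\partial}{\partial y}\right]^2.$$

We now use the Hermite polynomial recursion relations (6.4.60, 61) when multiplying
by $L_2$: we derive

$$L_2P_2(u) = \left[-u\frac{\partial}{\partial y} + U'(y)\frac{\partial}{\partial u}\right]P_2(u) = -\sqrt3\,U'(y)P_3(u) - \left[\sqrt3\,P_3(u) + \sqrt2\,P_1(u)\right]\frac{\partial}{\partial y}. \tag{6.4.103}$$

Finally, multiplying by $P$ annihilates all terms since $P_0(u)$ does not occur. Hence,

$$B = PL_2L_1^{-1}(1 - P)L_2L_1^{-1}L_2 = 0. \tag{6.4.104}$$

The computation of $C$ and $DA$ follows similarly.
One finds that

$$C = \frac{\partial^2}{\partial y^2}\left[U'(y) + \frac{\partial}{\partial y}\right]^2 \tag{6.4.105}$$

and

$$DA = -\frac{\partial}{\partial y}\left[U'(y) + \frac{\partial}{\partial y}\right]\frac{\partial}{\partial y}\left[U'(y) + \frac{\partial}{\partial y}\right] \tag{6.4.106}$$

so that

$$C + DA = \frac{\partial}{\partial y}\,U''(y)\left[U'(y) + \frac{\partial}{\partial y}\right] \tag{6.4.107}$$

and (6.4.101) is equivalent to the differential equation

$$\frac{\partial\hat p}{\partial t} = \frac{\partial}{\partial y}\left[\gamma^{-1} + \gamma^{-3}U''(y)\right]\left[U'(y) + \frac{\partial}{\partial y}\right]\hat p \tag{6.4.108}$$

with the initial condition

$$\lim_{t\to0}\hat p(y, t) = \left\{1 - \gamma^{-2}\frac{\partial}{\partial y}\left[U'(y) + \frac{\partial}{\partial y}\right]\right\}\hat p(y, 0). \tag{6.4.109}$$

This alteration of the initial condition is a reflection of a "layer" phenomenon.
Equation (6.4.108) is valid for $t \gg \gamma^{-1}$ and is known as the corrected Smoluchowski
equation.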
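
A simple check of the corrected Smoluchowski equation is possible for the harmonic potential $U(y) = \tfrac12\omega^2y^2$, an illustrative choice that is not taken from the text. Under the full dynamics (6.4.81) the mean $\langle y\rangle$ obeys a damped linear oscillator equation whose slow decay rate is known in closed form, whereas (6.4.108) predicts the rate $(\gamma^{-1} + \gamma^{-3}\omega^2)\omega^2$. The Python sketch below compares the two; the function names are hypothetical and the code assumes the overdamped regime $\gamma > 2\omega$.

```python
import math


def exact_slow_rate(gamma: float, omega: float) -> float:
    """Slow decay rate of <y> for d<y>/dt = <u>, d<u>/dt = -omega^2 <y> - gamma <u>.

    The characteristic equation is lambda^2 + gamma*lambda + omega^2 = 0;
    the slow root is returned as a positive decay rate (requires gamma > 2*omega).
    """
    return 0.5 * (gamma - math.sqrt(gamma * gamma - 4.0 * omega * omega))


def smoluchowski_rate(gamma: float, omega: float) -> float:
    """Decay rate of <y> predicted by the lowest-order Smoluchowski equation."""
    return omega ** 2 / gamma


def corrected_rate(gamma: float, omega: float) -> float:
    """Decay rate of <y> predicted by (6.4.108) when U''(y) = omega^2 is constant."""
    return (1.0 / gamma + omega ** 2 / gamma ** 3) * omega ** 2


if __name__ == "__main__":
    g, w = 10.0, 1.0
    print(exact_slow_rate(g, w), smoluchowski_rate(g, w), corrected_rate(g, w))
```

For $\gamma = 10$, $\omega = 1$ the Smoluchowski rate is 0.1 and the corrected rate is 0.101, against an exact slow rate close to 0.10102, so the $\gamma^{-3}$ term removes most of the discrepancy in this example.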

The exact solution would take account of the behaviour in the time up to $t \sim \gamma^{-1}$
in which terms like $\exp(-\gamma t)$ occur. Graphically, the situation is as in Fig. 6.1. The
changed initial condition accounts for the effect of the initial layer near $t \sim \gamma^{-1}$.

Fig. 6.1. Formation of a layer at a boundary (vertical axis: $p(y,t)$). The exact
solution (solid line) changes rapidly near the boundary on
the left. The approximation (dashed line) is good except near
the boundary. The appropriate boundary condition for
the approximation is thus given by the smaller value,
where the dashed line meets the boundary.


c) Boundary Conditions

The higher order implementation of boundary conditions cannot be carried out by
the methods of this section, since a rapidly varying layer occurs in the variable $y$
near the boundary, and the assumption that the operator $\partial/\partial y$ is bounded becomes
unjustified. Significant progress has been made by Titulaer and co-workers
[6.10-12]. Suppose the boundary is at $y = 0$, then we can substitute $z = \gamma y$, $\tau = \gamma t$
into the Kramers equation (6.4.19) to obtain

$$\frac{\partial P}{\partial\tau} = \left[\frac{\partial}{\partial u}\left(u + \frac{\partial}{\partial u}\right) - u\frac{\partial}{\partial z}\right]P + \frac1\gamma\,U'(z/\gamma)\frac{\partial P}{\partial u} \tag{6.4.110}$$

so that the zero order problem is the solution of that part of the equation
independent of $\gamma$: to this order the potential is irrelevant. Only the stationary
solution of (6.4.110) has so far been amenable to treatment. It can be shown that
the stationary solution to the $\gamma \to \infty$ limit of (6.4.110) can be written in the form

$$P(u, z) = \psi_0'(u, z) + d_0^s\,\psi_0(u, z) + \sum_n d_n^s\,\psi_n(u, z) \tag{6.4.111}$$

where

$$\psi_0(u, z) = (2\pi)^{-1/2}\exp(-\tfrac12u^2) \tag{6.4.112}$$

$$\psi_0'(u, z) = (2\pi)^{-1/2}(z - u)\exp(-\tfrac12u^2) \tag{6.4.113}$$

and the $\psi_n(u, z)$ are certain complicated functions related to Hermite polynomials.
The problem of determining the coefficients $d_n^s$ is not straightforward, and the
reader is referred to [6.12] for a treatment. It is found that the solution has an
infinite derivative at $z = 0$, and for small $z$ is of the form $a + bz^{1/2}$.

6.5 White Noise Process as a Limit of Nonwhite Process 
The relationship between real noise and white noise has been mentioned previously
in Sects. 1.4.4, 4.1. We are interested in a limit of a differential equation

$$\frac{dx}{dt} = a(x) + b(x)\alpha_0(t), \tag{6.5.1}$$

where $\alpha_0(t)$ is a stochastic source with some nonzero correlation time. We will show
that if $\alpha(t)$ is a Markov process, then in the limit that it becomes a delta correlated
process, the differential equation becomes a Stratonovich stochastic differential
equation with the same coefficients, that is, it becomes

$$\text{(S)}\quad dx = a(x)\,dt + b(x)\,dW(t) \tag{6.5.2}$$

which is equivalent to the Ito equation

$$dx = [a(x) + \tfrac12b(x)b'(x)]\,dt + b(x)\,dW(t). \tag{6.5.3}$$

To achieve the limit of a delta correlated Markov process, we must take the large
$\gamma$ limit of

$$\alpha_0(t) = \gamma\alpha(\gamma^2t), \tag{6.5.4}$$

where $\alpha(t)$ is a stationary stochastic process with

$$\langle\alpha(t)\rangle = 0 \tag{6.5.5}$$

$$\langle\alpha(t)\alpha(0)\rangle_s = g(t). \tag{6.5.6}$$

Then,

$$\langle\alpha_0(t)\rangle = 0$$
$$\langle\alpha_0(t)\alpha_0(0)\rangle_s = \gamma^2g(\gamma^2t). \tag{6.5.7}$$

In the limit of infinite $\gamma$, the correlation function becomes a delta function. More
precisely, suppose

$$\int_{-\infty}^{\infty}g(t)\,dt = 1 \tag{6.5.8}$$

and

$$\int_{-\infty}^{\infty}|t|\,g(t)\,dt = \tau_c \tag{6.5.9}$$

defines the correlation time of the process $\alpha(t)$. [If $g(t)$ is exponential, then $\tau_c$ as
defined in (6.5.9) requires that

$$g(t) \propto \exp(-|t|/\tau_c)$$

which agrees with the usage in Sects. 1.4.4, 3.7.1.]
Then clearly, the correlation time of $\alpha_0(t)$ is $\tau_c/\gamma^2$, which becomes zero as $\gamma \to \infty$;
further,

$$\lim_{\gamma\to\infty}\langle\alpha_0(t)\alpha_0(0)\rangle_s = 0 \qquad (t \ne 0). \tag{6.5.10}$$

But at all stages,

$$\int_{-\infty}^{\infty}\langle\alpha_0(t)\alpha_0(0)\rangle_s\,dt = \int_{-\infty}^{\infty}g(\tau)\,d\tau = 1 \tag{6.5.11}$$

so that we can write

$$\lim_{\gamma\to\infty}\langle\alpha_0(t)\alpha_0(0)\rangle_s = \delta(t). \tag{6.5.12}$$

Therefore, the limit $\gamma \to \infty$ of $\alpha_0(t)$ does correspond to a normalised white noise
limit. The higher-order correlation functions might be thought to be important too,
but this turns out not to be the case.
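
The scaling argument leading to (6.5.10-12) can be illustrated numerically. The Python sketch below assumes the exponential form $g(t) = \exp(-|t|/\tau_c)/2\tau_c$, which satisfies (6.5.8, 9), and checks that the scaled correlation function $\gamma^2g(\gamma^2t)$ keeps unit area while its correlation time shrinks as $\tau_c/\gamma^2$. The function names and the simple trapezoidal quadrature are illustrative choices, not part of the text.

```python
import math


def g(t: float, tau_c: float = 1.0) -> float:
    """Normalised exponential correlation function with correlation time tau_c."""
    return math.exp(-abs(t) / tau_c) / (2.0 * tau_c)


def scaled_g(t: float, gamma: float, tau_c: float = 1.0) -> float:
    """Correlation function gamma^2 g(gamma^2 t) of alpha_0(t) = gamma * alpha(gamma^2 t)."""
    return gamma ** 2 * g(gamma ** 2 * t, tau_c)


def trapezoid(f, lo: float, hi: float, n: int) -> float:
    """Composite trapezoidal rule with n sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


def area_and_corr_time(gamma: float, tau_c: float = 1.0):
    """Return (integral of scaled g, integral of |t| * scaled g)."""
    # Integrate over +-40 scaled correlation times; the tails beyond are negligible.
    half_width = 40.0 * tau_c / gamma ** 2
    n = 80_000
    area = trapezoid(lambda t: scaled_g(t, gamma, tau_c), -half_width, half_width, n)
    corr = trapezoid(lambda t: abs(t) * scaled_g(t, gamma, tau_c),
                     -half_width, half_width, n)
    return area, corr


if __name__ == "__main__":
    for gamma in (1.0, 5.0, 20.0):
        print(gamma, area_and_corr_time(gamma))
```

The area stays at unity for every $\gamma$, while the second number falls off as $\gamma^{-2}$, which is the sense in which $\gamma^2g(\gamma^2t)$ approaches $\delta(t)$.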

We will give a demonstration in the case where $\alpha(t)$ is a Markov diffusion process
whose Fokker-Planck equation is

$$\frac{\partial p(\alpha)}{\partial t} = -\frac{\partial}{\partial\alpha}[A(\alpha)p(\alpha)] + \frac12\frac{\partial^2}{\partial\alpha^2}[B(\alpha)p(\alpha)]. \tag{6.5.13}$$

This means that the FPE for the pair $(x, \alpha)$ is

$$\frac{\partial p(x, \alpha)}{\partial t} = (\gamma^2L_1 + \gamma L_2 + L_3)\,p(x, \alpha) \tag{6.5.14}$$

with

$$L_1 = -\frac{\partial}{\partial\alpha}A(\alpha) + \frac12\frac{\partial^2}{\partial\alpha^2}B(\alpha)$$
$$L_2 = -\frac{\partial}{\partial x}b(x)\alpha$$
$$L_3 = -\frac{\partial}{\partial x}a(x). \tag{6.5.15}$$

The asymptotic analysis now proceeds similarly to that used in Sects. 6.4.1, 6.4.3,
with a slight modification to take account of the operator $L_3$. Analogously to Sect.
6.4.1, we define a projector $P$ on the space of functions of $x$ and $\alpha$ by

$$(Pf)(x, \alpha) = p_s(\alpha)\int d\alpha\,f(x, \alpha), \tag{6.5.16}$$

where $p_s(\alpha)$ is the solution of

$$L_1p_s(\alpha) = 0. \tag{6.5.17}$$

We assume that in the stationary distribution of $\alpha$, the mean $\langle\alpha\rangle_s$ is zero. This
means that the projection operator $P$ satisfies the essential condition

$$PL_2P = 0, \tag{6.5.18}$$

since

$$(PL_2Pf)(x, \alpha) = p_s(\alpha)\int d\alpha\left[-\frac{\partial}{\partial x}b(x)\,\alpha\,p_s(\alpha)\right]\int d\alpha'\,f(x, \alpha') = -p_s(\alpha)\langle\alpha\rangle_s\frac{\partial}{\partial x}b(x)\int d\alpha'\,f(x, \alpha') = 0. \tag{6.5.19}$$

Also, it is obvious that

$$PL_3 = L_3P \tag{6.5.20}$$

and, as before,

$$PL_1 = L_1P = 0. \tag{6.5.21}$$

Defining, as before

$$v = Pp \tag{6.5.22}$$
$$w = (1 - P)p \tag{6.5.23}$$

and using the symbols $\tilde v$, $\tilde w$ for the corresponding Laplace transforms, we find

$$s\,\tilde v(s) = P(\gamma^2L_1 + \gamma L_2 + L_3)\tilde p(s) + v(0) \tag{6.5.24}$$
$$= \gamma PL_2[P\tilde p(s) + (1 - P)\tilde p(s)] + L_3P\tilde p(s) + v(0)$$

so that

$$s\,\tilde v(s) = \gamma PL_2\tilde w(s) + L_3\tilde v(s) + v(0) \tag{6.5.25}$$

and similarly,

$$s\,\tilde w(s) = [\gamma^2L_1 + \gamma(1 - P)L_2 + L_3]\tilde w(s) + \gamma L_2\tilde v(s) + w(0) \tag{6.5.26}$$

which differ from (6.4.47) only by the existence of the $L_3\tilde v(s)$ term in (6.5.25) and
$L_3\tilde w$ in (6.5.26). We again assume $w(0) = 0$, which means that $\alpha(t)$ is a stationary
Markov process, so that

$$s\,\tilde v(s) = L_3\tilde v(s) - \gamma PL_2[-s + \gamma^2L_1 + \gamma(1 - P)L_2 + L_3]^{-1}\gamma L_2\tilde v(s) + v(0). \tag{6.5.27}$$


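Now the limit $\gamma \to \infty$ gives

$$s\tilde v(s) = (L_3 - PL_2L_1^{-1}L_2)\tilde v(s) + v(0). \tag{6.5.28}$$

We now compute $PL_2L_1^{-1}L_2\tilde v$. We write

$$\tilde v(s) = \tilde p(x)\,p_s(\alpha) \tag{6.5.29}$$

$$PL_2L_1^{-1}L_2\tilde v = p_s(\alpha)\int d\alpha'\left[-\frac{\partial}{\partial x}b(x)\alpha'\right]L_1^{-1}\left[-\frac{\partial}{\partial x}b(x)\alpha'\right]p_s(\alpha')\,\tilde p(x). \tag{6.5.30}$$

We now need to evaluate

$$\int d\alpha\,\alpha L_1^{-1}\alpha\,p_s(\alpha) \equiv -D \tag{6.5.31}$$

and to do this we need a convenient expression for $L_1^{-1}$. Consider

$$\int_0^t\exp(L_1t')\,dt' = L_1^{-1}\exp(L_1t) - L_1^{-1} \tag{6.5.32}$$

and using (6.4.29)

$$\int_0^\infty\exp(L_1t)\,dt = -L_1^{-1}(1 - P). \tag{6.5.33}$$

Since, by assumption,

$$P\alpha p_s(\alpha) = p_s(\alpha)\int d\alpha'\,\alpha'p_s(\alpha') = p_s(\alpha)\langle\alpha\rangle_s = 0 \tag{6.5.34}$$

we have

$$D = \int d\alpha\,\alpha\int_0^\infty\exp(L_1t)\,\alpha p_s(\alpha)\,dt. \tag{6.5.35}$$

We note that

$$\exp(L_1t)\,\alpha p_s(\alpha) \tag{6.5.36}$$

is the solution of the Fokker-Planck equation

$$\partial_tf = L_1f \tag{6.5.37}$$

with initial condition

$$f(\alpha, 0) = \alpha p_s(\alpha). \tag{6.5.38}$$

Hence,

$$\exp(L_1t)\,\alpha p_s(\alpha) = \int d\alpha'\,p(\alpha, t\,|\,\alpha', 0)\,\alpha'p_s(\alpha') \tag{6.5.39}$$

and substituting in (6.5.35), we find

$$D = \int_0^\infty dt\int d\alpha\,d\alpha'\,\alpha\alpha'\,p(\alpha', t\,|\,\alpha, 0)\,p_s(\alpha), \tag{6.5.40}$$

i.e.,

$$D = \int_0^\infty dt\,\langle\alpha(t)\alpha(0)\rangle_s \tag{6.5.41}$$

and from (6.5.8) and the symmetry of the correlation function,

$$D = 1/2. \tag{6.5.42}$$

Using (6.5.42) as the value of $D$, we find

$$-PL_2L_1^{-1}L_2\tilde v = \frac12\,p_s(\alpha)\frac{\partial}{\partial x}b(x)\frac{\partial}{\partial x}b(x)\,\tilde p(x) \tag{6.5.43}$$

so that the differential equation corresponding to (6.5.28) for

$$p(x, t) = \int d\alpha\,p(x, \alpha) \tag{6.5.44}$$

is

$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}a(x)\,p(x) + \frac12\frac{\partial}{\partial x}b(x)\frac{\partial}{\partial x}b(x)\,p(x). \tag{6.5.45}$$

This is, of course, the FPE in the Stratonovich form which corresponds to

$$\text{(S)}\quad dx = a(x)\,dt + b(x)\,dW(t) \tag{6.5.46}$$

or which has the Ito form

$$dx = [a(x) + \tfrac12b'(x)b(x)]\,dt + b(x)\,dW(t), \tag{6.5.47}$$

as originally asserted.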
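
The conversion between the Stratonovich form (6.5.46) and the Ito form (6.5.47) amounts to adding $\tfrac12b'(x)b(x)$ to the drift. A minimal Python sketch of that conversion is given below; the function name `ito_drift` is hypothetical and the derivative $b'(x)$ is approximated by a central finite difference, which is an implementation choice made here for illustration.

```python
from typing import Callable


def ito_drift(a: Callable[[float], float],
              b: Callable[[float], float],
              x: float,
              h: float = 1e-5) -> float:
    """Ito drift a(x) + (1/2) b'(x) b(x) for the Stratonovich SDE (S) dx = a dt + b dW."""
    b_prime = (b(x + h) - b(x - h)) / (2.0 * h)  # central finite difference for b'(x)
    return a(x) + 0.5 * b_prime * b(x)


if __name__ == "__main__":
    # Multiplicative noise b(x) = x with linear damping a(x) = -x:
    # the Ito drift becomes -x + x/2 = -x/2.
    print(ito_drift(lambda x: -x, lambda x: x, 2.0))
    # Additive noise b(x) = const gives no correction to the drift.
    print(ito_drift(lambda x: -x, lambda x: 0.3, 2.0))
```

For additive noise the two forms coincide; the correction only matters when $b$ depends on $x$, which is precisely the situation in which the white noise limit of (6.5.1) has to be interpreted with care.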
6.5.1 Generality of the Result 


A glance at the proof shows that all we needed was for $\alpha(t)$ to form a stationary
Markov process with zero mean and with an evolution equation of the form

$$\frac{\partial p(\alpha)}{\partial t} = L_1p(\alpha), \tag{6.5.48}$$

where $L_1$ is a linear operator. This is possible for any kind of Markov process,
in particular, for example, the random telegraph process in which $\alpha(t)$ takes on
values $\pm a$. In the limit $\gamma \to \infty$, the result is still a Fokker-Planck equation. This is
a reflection of the central limit theorem. For, the effective Gaussian white noise is
made up of the sum of many individual components, as $\gamma \to \infty$, and the net
result is still effectively Gaussian. In fact, Papanicolaou and Kohler [6.7] have rigorously
shown that the result is valid even if $\alpha(t)$ is a non-Markov process, provided it
is "strongly mixing" which, loosely speaking, means that all its correlation functions
decay rapidly for large time differences.
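
As an illustration of the random telegraph case, suppose the process jumps between $+a$ and $-a$ with the same rate $\lambda$ in each direction. The symmetric rate $\lambda$ is an assumption introduced here and is not specified in the text. The stationary correlation function is then a decaying exponential, and the constant $D$ of (6.5.41) follows at once:

```latex
\langle\alpha(t)\alpha(0)\rangle_s = a^2 e^{-2\lambda|t|},
\qquad
D = \int_0^\infty dt\; a^2 e^{-2\lambda t} = \frac{a^2}{2\lambda},
\qquad
D = \tfrac12 \;\Longleftrightarrow\; a^2 = \lambda .
```

With this normalisation the limiting equation for $x$ is again (6.5.45), even though $\alpha(t)$ itself takes only two values.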


6.5.2 More General Fluctuation Equations 


Notice that in (6.5.1), instead of defining $\alpha_0(t)$ as simply $\gamma\alpha(\gamma^2t)$, we can use the
more general form

$$\alpha_0(t, x) = \gamma\psi[x, \alpha(\gamma^2t)] \tag{6.5.49}$$

and now consider only $b(x) = 1$, since all $x$ dependence can be included in $\psi$.
We assume that

$$\int d\alpha\,\psi(x, \alpha)\,p_s(\alpha) = 0 \tag{6.5.50}$$

in analogy to the previous assumption $\langle\alpha\rangle_s = 0$.
Then $D$ becomes $x$ dependent, and we have to use

$$D(x) = \int_0^\infty dt\,\langle\psi[x, \alpha(t)]\,\psi[x, \alpha(0)]\rangle \tag{6.5.51}$$

and

$$E(x) = \int_0^\infty dt\left\langle\frac{\partial\psi}{\partial x}[x, \alpha(t)]\,\psi[x, \alpha(0)]\right\rangle \tag{6.5.52}$$

and the Fokker-Planck equation becomes

$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}[\{a(x) + E(x)\}p] + \frac{\partial^2}{\partial x^2}[D(x)p]. \tag{6.5.53}$$

In this form we have agreement with the form derived by Stratonovich [Ref. 6.3,
Eq. (4.4.39)].


6.5.3 Time Nonhomogeneous Systems


If instead of (6.5.1) we have

$$\frac{dx}{dt} = a(x, t) + b(x, t)\alpha_0(t), \tag{6.5.54}$$

the Laplace transform method cannot be used simply. We can evade this difficulty
by the following trick. Introduce the extra variable $\tau$ so that the equations become

$$dx = [a(x, \tau) + \gamma b(x, \tau)\alpha]\,dt \tag{6.5.55}$$
$$d\alpha = \gamma^2A(\alpha)\,dt + \gamma\sqrt{B(\alpha)}\,dW(t) \tag{6.5.56}$$
$$d\tau = dt. \tag{6.5.57}$$

The final equation constrains $\tau$ to be the same as $t$, but the system now forms a
homogeneous Markov process in the variables $(x, \alpha, \tau)$. Indeed, any nonhomogeneous
Markov process can be written as a homogeneous Markov process using
this trick.
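
The same device is familiar from ordinary differential equations, where explicit time dependence is removed by promoting time to a state variable. The Python sketch below illustrates this for a deterministic example only (no noise term is simulated); the functions `forcing`, `euler_nonautonomous` and `euler_augmented` are hypothetical names, and the forward Euler scheme is used purely for simplicity.

```python
import math
from typing import Callable, Tuple


def forcing(x: float, t: float) -> float:
    """Example right-hand side with explicit time dependence: dx/dt = -x + sin(t)."""
    return -x + math.sin(t)


def euler_nonautonomous(f: Callable[[float, float], float],
                        x0: float, t_end: float, n: int) -> float:
    """Forward Euler for dx/dt = f(x, t), reading the time from the step counter."""
    dt = t_end / n
    x = x0
    for k in range(n):
        x += f(x, k * dt) * dt
    return x


def euler_augmented(f: Callable[[float, float], float],
                    x0: float, t_end: float, n: int) -> Tuple[float, float]:
    """Same dynamics as an autonomous system in (x, tau) with d(tau)/dt = 1."""
    dt = t_end / n
    x, tau = x0, 0.0
    for _ in range(n):
        # Both components are advanced from the old state, as in a vector Euler step.
        x, tau = x + f(x, tau) * dt, tau + 1.0 * dt
    return x, tau


if __name__ == "__main__":
    print(euler_nonautonomous(forcing, 1.0, 2.0, 1000))
    print(euler_augmented(forcing, 1.0, 2.0, 1000))
```

Both routines trace the same trajectory up to rounding, and the auxiliary variable ends at $\tau = t$, which is the content of the constraint (6.5.57).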

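The Fokker-Planck equation is now

$$\frac{\partial p}{\partial t} = (\gamma^2L_1 + \gamma L_2 + L_3)\,p \tag{6.5.58}$$

with

$$L_1 = -\frac{\partial}{\partial\alpha}A(\alpha) + \frac12\frac{\partial^2}{\partial\alpha^2}B(\alpha) \tag{6.5.59}$$

$$L_2 = -\frac{\partial}{\partial x}b(x, \tau)\alpha \tag{6.5.60}$$

$$L_3 = -\frac{\partial}{\partial\tau} - \frac{\partial}{\partial x}a(x, \tau). \tag{6.5.61}$$

Using the same procedure as before, we obtain

$$\frac{\partial p}{\partial t} = \left[-\frac{\partial}{\partial\tau} - \frac{\partial}{\partial x}a(x, \tau) + \frac12\frac{\partial}{\partial x}b(x, \tau)\frac{\partial}{\partial x}b(x, \tau)\right]p \tag{6.5.62}$$

which yields

$$d\tau = dt \tag{6.5.63}$$

so that we have, after eliminating $\tau$ in terms of $t$,

$$\frac{\partial p}{\partial t} = \left[-\frac{\partial}{\partial x}a(x, t) + \frac12\frac{\partial}{\partial x}b(x, t)\frac{\partial}{\partial x}b(x, t)\right]p \tag{6.5.64}$$

in exact analogy to (6.5.45).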

6.5.4 Effect of Time Dependence in $L_1$

Suppose, in addition, that $A$ and $B$ depend on time as well, so that

$$L_1 = -\frac{\partial}{\partial\alpha}A(\alpha, \tau) + \frac12\frac{\partial^2}{\partial\alpha^2}B(\alpha, \tau). \tag{6.5.65}$$

In this case, we find $P$ is a function of $\tau$ and hence does not commute with $L_3$. Thus,

$$PL_3 \ne L_3P. \tag{6.5.66}$$

Nevertheless, we can take care of this. Defining $\tilde v(s)$ and $\tilde w(s)$ as before, we have

$$s\,\tilde v(s) = P(\gamma L_2 + L_3)\tilde w(s) + PL_3\tilde v(s) + v(0) \tag{6.5.67}$$

$$s\,\tilde w(s) = [\gamma^2L_1 + \gamma(1 - P)L_2 + (1 - P)L_3]\tilde w(s) + \gamma L_2\tilde v(s) + (1 - P)L_3\tilde v(s) \tag{6.5.68}$$

so that

$$s\,\tilde v(s) = PL_3\tilde v(s) + P(\gamma L_2 + L_3)[s - \gamma^2L_1 - \gamma(1 - P)L_2 - (1 - P)L_3]^{-1}[\gamma L_2 + (1 - P)L_3]\tilde v(s) + v(0). \tag{6.5.69}$$

We see that because $L_2$ is multiplied by $\gamma$ and $L_3$ is not, we get in the limit of large $\gamma$

$$s\,\tilde v(s) \simeq (PL_3 - PL_2L_1^{-1}L_2)\tilde v(s) + v(0). \tag{6.5.70}$$

In this case we will not assume that we can normalise the autocorrelation function
to a constant. The term $-PL_2L_1^{-1}L_2$ gives

$$\frac{\partial}{\partial x}b(x, \tau)\frac{\partial}{\partial x}b(x, \tau)\int_0^\infty dt\,\langle\alpha_\tau(t)\alpha_\tau(0)\rangle, \tag{6.5.71}$$

where by $\alpha_\tau(t)$ we mean the random variable whose FPE is

$$\frac{\partial p}{\partial t} = \left[-\frac{\partial}{\partial\alpha}A(\alpha, \tau) + \frac12\frac{\partial^2}{\partial\alpha^2}B(\alpha, \tau)\right]p. \tag{6.5.72}$$

Thus, the limit $\gamma \to \infty$ effectively makes the random motion of $\alpha$ infinitely faster
than the motion due to the time dependence of $\alpha$ arising from the time dependence
of $A$ and $B$. Defining

$$D(\tau) = \int_0^\infty dt\,\langle\alpha_\tau(t)\alpha_\tau(0)\rangle \tag{6.5.73}$$

we find, by eliminating $\tau$ as before,

$$\frac{\partial p}{\partial t} = \left[-\frac{\partial}{\partial x}a(x, t) + D(t)\frac{\partial}{\partial x}b(x, t)\frac{\partial}{\partial x}b(x, t)\right]p. \tag{6.5.74}$$


6.6 Adiabatic Elimination of Fast Variables: The General Case


We now want to consider the general case of two variables $x$ and $\alpha$ which are
coupled together in such a way that each affects the other. This is now a problem
analogous to the derivation of the Smoluchowski equation with nonvanishing
$V'(x)$, whereas the previous section was a generalization of the same equation
with $V'(x)$ set equal to zero.

The most general problem of this kind would be so complex and unwieldy as to
be incomprehensible. In order to introduce the concepts involved, we will first consider
an example of a linear chemical system and then develop a generalised theory.


6.6.1 Example: Elimination of Short-Lived Chemical Intermediates 


We consider the example of a chemically reacting system

$$X \underset{\gamma}{\overset{k}{\rightleftharpoons}} Y \underset{k}{\overset{\gamma}{\rightleftharpoons}} A \tag{6.6.1}$$

where $X$ and $Y$ are chemical species whose quantities vary, but $A$ is by some means
held fixed. The deterministic rate equations for this system are

$$\frac{dx}{dt} = -x + \gamma y \tag{6.6.2a}$$

$$\frac{dy}{dt} = -2\gamma y + x + a. \tag{6.6.2b}$$

Here $x$, $y$, $a$ are the concentrations of $X$, $Y$, $A$. The rate constants have been chosen
so that $k = 1$, for simplicity.

The physical situation is often that $Y$ is a very short-lived intermediate state
which can decay to $X$ or $A$, with time constant $\gamma^{-1}$. Thus, the limit of large $\gamma$ in
which the short-lived intermediate $Y$ becomes even more short lived, and its
concentration negligible, is of interest. This results in the situation where we solve
(6.6.2b) with $dy/dt = 0$ so that

$$y = (x + a)/2\gamma, \tag{6.6.3}$$

and substitute this in (6.6.2a) to get

$$\frac{dx}{dt} = -\frac x2 + \frac a2. \tag{6.6.4}$$

The stochastic analogue of this procedure is complicated by the fact that the white
noises to be added to (6.6.2) are correlated, and the stationary distribution of $y$
depends on $\gamma$. More precisely, the stochastic differential equations corresponding
to (6.6.2) are usually chosen to be (Sect. 7.6.1)

$$dx = (-x + \gamma y)\,dt + \varepsilon B_{11}\,dW_1(t) + \varepsilon B_{12}\,dW_2(t)$$
$$dy = (-2\gamma y + x + a)\,dt + \varepsilon B_{21}\,dW_1(t) + \varepsilon B_{22}\,dW_2(t), \tag{6.6.5}$$

where the matrix $B$ satisfies

$$BB^{\mathrm T} = \begin{pmatrix} 2a & -2a \\ -2a & 4a \end{pmatrix}. \tag{6.6.6}$$

Here $\varepsilon$ is a parameter, which is essentially the square root of the inverse volume of
the reacting system and is usually small, though we shall not make use of this fact
in what follows.

We wish to eliminate the variable $y$, whose mean value would be given by (6.6.3)
and becomes vanishingly small in the limit. It is only possible to apply the ideas we
have been developing if the variable being eliminated has a distribution function
in the stationary state which is independent of $\gamma$. We will thus have to define a
new variable as a function of $y$ and $x$ which possesses this desirable property.

The Fokker-Planck equation corresponding to (6.6.5) is

$$\frac{\partial p}{\partial t} = \left[\frac{\partial}{\partial x}(x - \gamma y) + \varepsilon^2a\frac{\partial^2}{\partial x^2} - 2\varepsilon^2a\frac{\partial^2}{\partial x\,\partial y} + \frac{\partial}{\partial y}(2\gamma y - x - a) + 2\varepsilon^2a\frac{\partial^2}{\partial y^2}\right]p. \tag{6.6.7}$$

It seems reasonable to define a new variable $z$ by

$$z = 2\gamma y - x - a \tag{6.6.8}$$

which is proportional to the difference between $y$ and its stationary value. Thus, we
formally define a pair of new variables $(x_1, z)$ by

$$x_1 = x, \qquad\qquad\qquad x = x_1$$
$$z = 2\gamma y - x - a, \qquad y = (z + x_1 + a)/2\gamma \tag{6.6.9}$$

so that we can transform the FPE using

$$\frac{\partial}{\partial x} = \frac{\partial}{\partial x_1} - \frac{\partial}{\partial z}, \qquad \frac{\partial}{\partial y} = 2\gamma\frac{\partial}{\partial z} \tag{6.6.10}$$

to obtain

$$\frac{\partial p}{\partial t} = \left[\frac{\partial}{\partial x_1}\left(\frac{x_1 - a}{2} - \frac z2\right) + \varepsilon^2a\frac{\partial^2}{\partial x_1^2} - 2\varepsilon^2a(1 + 2\gamma)\frac{\partial^2}{\partial x_1\,\partial z} + \frac{\partial}{\partial z}\left(2\gamma z - \frac{x_1 - a}{2} + \frac z2\right) + \frac{\partial^2}{\partial z^2}\left(8\varepsilon^2\gamma^2a + 4\gamma\varepsilon^2a + \varepsilon^2a\right)\right]p. \tag{6.6.11}$$

The limit of $\gamma \to \infty$ does not yet give a Fokker-Planck operator in $z$, which is simply
proportional to a fixed operator: we see that the drift and diffusion terms for $z$
are proportional to $\gamma$ and $\gamma^2$, respectively.

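However, the substitution

$$\alpha = z\gamma^{-1/2} \tag{6.6.12}$$

changes this. In terms of $\alpha$, the drift and diffusion coefficients become proportional
to $\gamma$ and we can see (now writing $x$ instead of $x_1$)

$$\frac{\partial p}{\partial t} = [\gamma L_1 + \gamma^{1/2}L_2(\gamma) + L_3]\,p \tag{6.6.13}$$

in which

$$L_1 = 2\frac{\partial}{\partial\alpha}\alpha + 8\varepsilon^2a\frac{\partial^2}{\partial\alpha^2} \tag{6.6.14}$$

$$L_2(\gamma) = \frac{\partial}{\partial x}\left[-4\varepsilon^2a\frac{\partial}{\partial\alpha} - \frac12\alpha\right] + \gamma^{-1/2}\left[\frac12\frac{\partial}{\partial\alpha}\alpha + 4\varepsilon^2a\frac{\partial^2}{\partial\alpha^2}\right] + \gamma^{-1}\left[-\frac{\partial}{\partial\alpha}\frac{x - a}{2} - 2\varepsilon^2a\frac{\partial^2}{\partial x\,\partial\alpha}\right] + \gamma^{-3/2}\varepsilon^2a\frac{\partial^2}{\partial\alpha^2} \tag{6.6.15}$$

$$L_3 = \frac{\partial}{\partial x}\,\frac{x - a}{2} + \varepsilon^2a\frac{\partial^2}{\partial x^2}. \tag{6.6.16}$$

Notice that $L_2(\gamma)$ has a large $\gamma$ limit given by the first bracketed term of
(6.6.15). The only important property of $L_2$ is that $PL_2P = 0$. Defining $P$ as usual

$$Pf(x, \alpha) = p_s(\alpha)\int d\alpha'\,f(x, \alpha') \tag{6.6.17}$$

where $p_s(\alpha)$ is the stationary solution for $L_1$, we see that for any operator beginning
with $\partial/\partial\alpha$ [such as the $\gamma$ dependent part of $L_2(\gamma)$] we have

$$P\frac{\partial}{\partial\alpha}f = p_s(\alpha)\int d\alpha'\,\frac{\partial}{\partial\alpha'}f(x, \alpha') = 0, \tag{6.6.18}$$

provided we can drop boundary terms. Hence, the $\gamma$ dependent part of $L_2(\gamma)$ satisfies
$PL_2P = 0$. Further, it is clear from (6.6.14) that $\langle\alpha\rangle_s = 0$, so we find that the
first, $\gamma$ independent part, also satisfies this condition. Thus

$$PL_2(\gamma)P = 0. \tag{6.6.19}$$

Nevertheless, it is worth commenting that the $\gamma$ dependent part of $L_2(\gamma)$ contains
terms which look more appropriate to $L_1$, that is, terms not involving any $x$ derivatives.
However, by moving these terms into $L_2$, we arrange for $L_1$ to be independent
of $\gamma$. Thus, $P$ is independent of $\gamma$ and the limits are clearer.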

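The procedure is now quite straightforward. Defining, as usual

$$P\tilde p(s) = \tilde v(s)$$
$$(1 - P)\tilde p(s) = \tilde w(s) \tag{6.6.20}$$

and assuming, as usual, $w(0) = 0$, we find

$$s\,\tilde v(s) = P[\gamma L_1 + \gamma^{1/2}L_2(\gamma) + L_3][\tilde v(s) + \tilde w(s)] + v(0) \tag{6.6.21}$$

and using

$$PL_1 = L_1P = 0$$
$$PL_2P = 0 \tag{6.6.22}$$
$$PL_3 = L_3P,$$

we obtain

$$s\,\tilde v(s) = P\gamma^{1/2}L_2(\gamma)\tilde w(s) + L_3\tilde v(s) + v(0) \tag{6.6.23}$$

and similarly,

$$s\,\tilde w(s) = [\gamma L_1 + \gamma^{1/2}(1 - P)L_2(\gamma) + L_3]\tilde w(s) + \gamma^{1/2}L_2(\gamma)\tilde v(s) \tag{6.6.24}$$

so that

$$s\,\tilde v(s) = \{L_3 + \gamma PL_2(\gamma)[s - \gamma L_1 - \gamma^{1/2}(1 - P)L_2(\gamma) - L_3]^{-1}L_2(\gamma)\}\tilde v(s) + v(0). \tag{6.6.25}$$

Now taking the large $\gamma$ limit, we get

$$s\tilde v(s) = (L_3 - PL_2L_1^{-1}L_2)\tilde v(s) + v(0), \tag{6.6.26}$$

where

$$L_2 = \lim_{\gamma\to\infty}L_2(\gamma) = \frac{\partial}{\partial x}\left(-\frac12\alpha - 4\varepsilon^2a\frac{\partial}{\partial\alpha}\right). \tag{6.6.27}$$

Equation (6.6.26) is of exactly the same form as (6.5.28) and indeed the formal
derivation from (6.6.7) is almost identical. The evaluation of $PL_2L_1^{-1}L_2$ is, however,
slightly different because of the existence of terms involving $\partial/\partial\alpha$. Firstly, notice that
since $P\,\partial/\partial\alpha = 0$, we can write

$$-PL_2L_1^{-1}L_2v = -p_s(\alpha)\int d\alpha'\left[-\frac12\alpha'\frac{\partial}{\partial x}\right]L_1^{-1}\frac{\partial}{\partial x}\left(-4\varepsilon^2a\frac{\partial}{\partial\alpha'} - \frac12\alpha'\right)p_s(\alpha')\,\hat p(x) \tag{6.6.28}$$

and from the definition of $p_s(\alpha)$ as satisfying $L_1p_s(\alpha) = 0$, we see from (6.6.14) that

$$\frac{\partial}{\partial\alpha}p_s(\alpha) = -\alpha p_s(\alpha)/4\varepsilon^2a \tag{6.6.29}$$

and hence that

$$-PL_2L_1^{-1}L_2v = \frac14\,p_s(\alpha)\frac{\partial^2\hat p(x)}{\partial x^2}\int d\alpha'\,\alpha'L_1^{-1}\alpha'p_s(\alpha') \tag{6.6.30}$$

$$= -\frac14\,p_s(\alpha)\frac{\partial^2\hat p(x)}{\partial x^2}\int_0^\infty dt\,\langle\alpha(t)\alpha(0)\rangle_s, \tag{6.6.31}$$

where we have used the reasoning given in Sect. 6.5 to write the answer in terms
of the correlation function. Here, $L_1$ is the generator of an Ornstein-Uhlenbeck
process (Sect. 3.8.4) with $k = 2$, $D = 16\varepsilon^2a$, so that from (3.8.2),

$$-PL_2L_1^{-1}L_2v = -\frac14\,p_s(\alpha)\frac{\partial^2\hat p(x)}{\partial x^2}\,4\varepsilon^2a\int_0^\infty dt\,e^{-2t} = -\frac12\varepsilon^2a\,\frac{\partial^2\hat p(x)}{\partial x^2}\,p_s(\alpha). \tag{6.6.32}$$

Hence, from (6.6.26), the effective Fokker-Planck equation is

$$\frac{\partial\hat p}{\partial t} = \frac{\partial}{\partial x}\left(\frac{x - a}{2}\,\hat p\right) + \frac12\varepsilon^2a\frac{\partial^2\hat p}{\partial x^2}. \tag{6.6.33}$$

Comments
i) This is exactly the equation expected from the reaction

$$X \underset{k/2}{\overset{k/2}{\rightleftharpoons}} A \tag{6.6.34}$$

(with $k = 1$) (Sect. 7.5.3). It is expected because general principles tell us that the
stationary variance of the concentration fluctuations is given by

$$\mathrm{var}\{x(t)\}_s = \varepsilon^2\langle x\rangle_s. \tag{6.6.35}$$

ii) Notice that the net effect of the adiabatic elimination is to reduce the coefficient
of $\partial^2/\partial x^2$, a result of the correlation between the noise terms in $x$ and $y$ in the original
equations.

iii) This result differs from the usual adiabatic elimination in that the noise term
in the eliminated variable is important. There are cases where this is not so; they
will be treated shortly.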
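
Because the system (6.6.5) is linear with constant noise, its stationary covariance matrix $\sigma$ can be found directly from the linear equation $F\sigma + \sigma F^{\mathrm T} + \varepsilon^2BB^{\mathrm T} = 0$, where $F$ is the drift matrix. The Python sketch below solves this equation for finite $\gamma$ and compares the variance of $x$ with the value $\varepsilon^2a$ implied by comments i) and ii), and with the value $2\varepsilon^2a$ that would follow if the $\partial^2/\partial x^2$ coefficient of the original equation were kept unchanged. The function names are hypothetical, and the small Gaussian elimination routine is included only to keep the example self-contained.

```python
from typing import List, Tuple


def solve_linear(m: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve m * u = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = s / aug[r][r]
    return sol


def stationary_covariance(gamma: float, a: float, eps: float) -> Tuple[float, float, float]:
    """Solve F S + S F^T + eps^2 B B^T = 0 for the linear system (6.6.5).

    Drift matrix F = [[-1, gamma], [1, -2 gamma]]; unknowns (S_xx, S_xy, S_yy).
    """
    d = eps ** 2
    m = [
        [-2.0, 2.0 * gamma, 0.0],              # (x, x) component
        [1.0, -(1.0 + 2.0 * gamma), gamma],    # (x, y) component
        [0.0, 2.0, -4.0 * gamma],              # (y, y) component
    ]
    rhs = [-2.0 * a * d, 2.0 * a * d, -4.0 * a * d]
    sxx, sxy, syy = solve_linear(m, rhs)
    return sxx, sxy, syy


def reduced_variance(a: float, eps: float) -> float:
    """Stationary variance of x from (6.6.33): drift rate 1/2, diffusion (1/2) eps^2 a."""
    return eps ** 2 * a


def naive_variance(a: float, eps: float) -> float:
    """Variance if the original coefficient eps^2 a of d^2/dx^2 were kept with drift rate 1/2."""
    return 2.0 * eps ** 2 * a


if __name__ == "__main__":
    print(stationary_covariance(50.0, 3.0, 0.1), reduced_variance(3.0, 0.1))
```

In this linear example the full two-variable calculation already gives $\mathrm{var}\{x\}_s = \varepsilon^2a$, in agreement with (6.6.35), so the reduction of the diffusion coefficient noted in comment ii) is exactly what is needed to reproduce it.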


6.6.2 Adiabatic Elimination in Haken’s Model 
Haken has introduced a simple model for demonstrating adiabatic elimination
[Ref. 6.1, Sect. 7.2]. The deterministic version of the model is a pair of coupled
equations which may be written

$$\dot x = -\varepsilon x - ax\alpha \tag{6.6.36}$$
$$\dot\alpha = -\kappa\alpha + bx^2. \tag{6.6.37}$$

One assumes that if $\kappa$ is sufficiently large, we may, as before, replace $\alpha$ by the
stationary solution of (6.6.37) in terms of $x$ to obtain

$$\alpha = \frac b\kappa x^2 \tag{6.6.38}$$

$$\dot x = -\varepsilon x - \frac{ab}{\kappa}x^3. \tag{6.6.39}$$

The essential aim of the model is to obtain the cubic form on the right-hand side of
(6.6.39).
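
The deterministic elimination can be checked by integrating the pair (6.6.36, 37) alongside the reduced equation (6.6.39). The Python sketch below does this with a forward Euler scheme; the parameter values, the step size and the function names are illustrative assumptions, and the full system is started on the slaved value (6.6.38) so that no initial transient in $\alpha$ has to be resolved.

```python
from typing import Tuple


def simulate_full(eps: float, kappa: float, a: float, b: float,
                  x0: float, t_end: float, dt: float) -> Tuple[float, float]:
    """Forward Euler integration of the coupled pair (6.6.36, 37)."""
    x, alpha = x0, b * x0 ** 2 / kappa  # start on the slaved value (6.6.38)
    for _ in range(int(round(t_end / dt))):
        dx = (-eps * x - a * x * alpha) * dt
        dalpha = (-kappa * alpha + b * x ** 2) * dt
        x, alpha = x + dx, alpha + dalpha
    return x, alpha


def simulate_reduced(eps: float, kappa: float, a: float, b: float,
                     x0: float, t_end: float, dt: float) -> float:
    """Forward Euler integration of the adiabatically eliminated equation (6.6.39)."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += (-eps * x - (a * b / kappa) * x ** 3) * dt
    return x


if __name__ == "__main__":
    params = dict(eps=0.01, kappa=10.0, a=1.0, b=1.0, x0=1.0, t_end=20.0, dt=0.001)
    x_full, alpha_full = simulate_full(**params)
    x_red = simulate_reduced(**params)
    print(x_full, x_red, alpha_full, params["b"] * x_full ** 2 / params["kappa"])
```

With $\kappa$ a thousand times larger than $\varepsilon$ the two values of $x$ stay close and $\alpha$ tracks $bx^2/\kappa$; the agreement degrades as $\kappa$ is reduced towards the relaxation rate of $x$.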

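In making the transition to a stochastic system, we find that there are various
possibilities available. The usual condition for the validity of adiabatic elimination
is

$$\varepsilon \ll \kappa. \tag{6.6.40}$$

In a stochastic version, all other parameters come into play as well, and the condition
(6.6.40) is, in fact, able to be realised in at least three distinct ways with
characteristically different answers.

Let us write stochastic versions of (6.6.36, 37):

$$dx = -(\varepsilon x + ax\alpha)\,dt + C\,dW_1(t)$$
$$d\alpha = (-\kappa\alpha + bx^2)\,dt + D\,dW_2(t) \tag{6.6.41}$$

and we assume here, for simplicity, that $C$ and $D$ are constants and $W_1(t)$ and $W_2(t)$
are independent of each other.
The Fokker-Planck equation is

$$\frac{\partial p}{\partial t} = \left[\frac{\partial}{\partial x}(\varepsilon x + ax\alpha) + \frac12C^2\frac{\partial^2}{\partial x^2} + \frac{\partial}{\partial\alpha}(\kappa\alpha - bx^2) + \frac12D^2\frac{\partial^2}{\partial\alpha^2}\right]p. \tag{6.6.42}$$

We wish to eliminate $\alpha$. It is convenient to define a new variable $\beta$ by

$$\beta = \alpha - \frac b\kappa x^2 \tag{6.6.43}$$

so that, for fixed $x$, the quantity $\beta$ has zero mean. In terms of this variable, we can
write a FPE:

$$\frac{\partial p}{\partial t} = (L_1^0 + L_2^0 + L_3^0)\,p \tag{6.6.44}$$

$$L_1^0 = \frac{\partial}{\partial\beta}\kappa\beta + \frac12D^2\frac{\partial^2}{\partial\beta^2} \tag{6.6.45}$$

$$L_2^0 = \frac{\partial}{\partial x}ax\beta - \frac{2bx}{\kappa}\frac{\partial}{\partial\beta}\left(\varepsilon x + \frac{ab}{\kappa}x^3 + ax\beta\right) + \frac12C^2\left[\frac{4b^2x^2}{\kappa^2}\frac{\partial^2}{\partial\beta^2} - \frac{2b}{\kappa}\left(\frac{\partial}{\partial x}x + x\frac{\partial}{\partial x}\right)\frac{\partial}{\partial\beta}\right] \tag{6.6.46}$$

$$L_3^0 = \frac{\partial}{\partial x}\left(\varepsilon x + \frac{ab}{\kappa}x^3\right) + \frac12C^2\frac{\partial^2}{\partial x^2}. \tag{6.6.47}$$

In terms of these variables, the limit $\varepsilon \to 0$ is not interesting since we simply get the
same system with $\varepsilon = 0$. No elimination is possible since $L_1^0$ is not multiplied by a
large parameter.

In order for the limit $\varepsilon \to 0$ to have the meaning deterministically that (6.6.39)
is a valid limiting form, there must exist an $A$ such that

$$\frac{ab}{\kappa} = \varepsilon A, \qquad\text{as}\quad \varepsilon \to 0. \tag{6.6.48}$$

For this limit to be recognisable deterministically, it must not be swamped by
noise so one must also have

$$\frac{C^2}{2} = \varepsilon B \qquad\text{as}\quad \varepsilon \to 0 \tag{6.6.49}$$

which means, as $\varepsilon \to 0$

$$L_3^0 = \varepsilon\left[\frac{\partial}{\partial x}(x + Ax^3) + B\frac{\partial^2}{\partial x^2}\right]. \tag{6.6.50}$$

However, there are two distinct possibilities for $L_2^0$. In order for $L_1^0$ to be independent
of $\varepsilon$, we must have $\kappa$ independent of $\varepsilon$, which is reasonable. Thus, the
limit (6.6.48) must be achieved by the product $ab$ being proportional to $\varepsilon$. We consider
various possibilities.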


a) The Silent Slave: $a$ Proportional to $\varepsilon$
We assume we can write

$$a = \varepsilon\tilde a. \tag{6.6.51}$$

We see that $L_1^0$ is independent of $\varepsilon$ while $L_2^0$ and $L_3^0$ are proportional to $\varepsilon$. If we
rescale time by

$$\tau = \varepsilon t \tag{6.6.52}$$

then

$$\frac{\partial p}{\partial\tau} = \left(\frac1\varepsilon L_1 + L_2 + L_3\right)p, \tag{6.6.53}$$

where

$$L_1 = L_1^0$$
$$L_2 = L_2^0/\varepsilon \tag{6.6.54}$$
$$L_3 = L_3^0/\varepsilon.$$

Clearly, the usual elimination procedure gives to lowest order

$$\frac{\partial\hat p}{\partial\tau} = L_3\hat p = \left[\frac{\partial}{\partial x}(x + Ax^3) + B\frac{\partial^2}{\partial x^2}\right]\hat p \tag{6.6.55}$$

since $L_2$ does not become infinite as $\varepsilon \to 0$.

This corresponds exactly to eliminating $\alpha$ adiabatically, ignoring the fluctuations
in $\alpha$ and simply setting the deterministic value in the $x$ equation. I call it the "silent
slave", since (in Haken's terminology) $\alpha$ is slaved by $x$ and makes no contribution
to the noise in the $x$ equation. This is the usual form of slaving, as considered by
Haken.


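b) The Noisy Slave: $a$ Proportional to $\varepsilon^{1/2}$
If we alternatively assume that both $a$ and $b$ are proportional to $\varepsilon^{1/2}$, we can write

$$a = \tilde a\varepsilon^{1/2}$$
$$b = \tilde b\varepsilon^{1/2}, \tag{6.6.56}$$

where

$$\tilde a\tilde b = \kappa A. \tag{6.6.57}$$

$L_1^0$ stays constant, $L_3^0$ is proportional to $\varepsilon$ and

$$L_2^0 = \varepsilon^{1/2}L_2 + \text{higher order terms in } \varepsilon, \tag{6.6.58}$$

where

$$L_2 = \tilde a\frac{\partial}{\partial x}x\beta. \tag{6.6.59}$$

Thus, the limiting equation is

$$\frac{\partial\hat p}{\partial\tau} = (L_3 - PL_2L_1^{-1}L_2)\,\hat p. \tag{6.6.60}$$

The term $PL_2L_1^{-1}L_2$ can be worked out as previously; we find

$$PL_2L_1^{-1}L_2 = -\frac{\tilde a^2D^2}{2\kappa^2}\frac{\partial}{\partial x}x\frac{\partial}{\partial x}x \tag{6.6.61}$$

so

$$\frac{\partial\hat p}{\partial\tau} = \left[\frac{\partial}{\partial x}(x + Ax^3) + B\frac{\partial^2}{\partial x^2} + \frac{\tilde a^2D^2}{2\kappa^2}\frac{\partial}{\partial x}x\frac{\partial}{\partial x}x\right]\hat p. \tag{6.6.62}$$

I call this the "noisy slave", since the slave makes his presence felt in the final
equation by adding noise (and affecting the drift, though this appears only in the Ito
form as written; as a Stratonovich form, there would be no extra drift).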


c) The General Case

Because we assume $ab \propto \varepsilon$, it can be seen that the second two terms in (6.6.46) are
always proportional to $\varepsilon^p$, where $p > 1$, and hence are negligible (provided $b$ is
bounded). Thus, the only term of significance in $L_2^0$ is the first. Then it follows that if

$$a = \varepsilon^r\tilde a,$$

we have the following possibilities:

$r > \tfrac12$: no effect from $L_2$: limiting equation is (6.6.55),
$r = \tfrac12$: limiting equation is (6.6.62), a noisy slave,
$r < \tfrac12$: the term $PL_2L_1^{-1}L_2$ becomes of order $\varepsilon^{2r-1} \to \infty$ and is dominant.

The equation is asymptotically (for $r < \tfrac12$)

$$\frac{\partial\hat p}{\partial\tau} = \varepsilon^{2r-1}\frac{\tilde a^2D^2}{2\kappa^2}\frac{\partial}{\partial x}x\frac{\partial}{\partial x}x\,\hat p. \tag{6.6.63}$$

These are quite distinct differences, all of which can be incorporated in the one
formula, namely, in general

$$\frac{\partial\hat p}{\partial\tau} = \left[\frac{\partial}{\partial x}(x + Ax^3) + B\frac{\partial^2}{\partial x^2} + \varepsilon^{2r-1}\frac{\tilde a^2D^2}{2\kappa^2}\frac{\partial}{\partial x}x\frac{\partial}{\partial x}x\right]\hat p. \tag{6.6.64}$$

In applying adiabatic elimination techniques, in general, one simply must take particular
care to ensure that the correct dependence on small parameters of all
constants in the system has been taken.
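
The three regimes can be summarised in a few lines of code. The Python sketch below evaluates the prefactor $\varepsilon^{2r-1}\tilde a^2D^2/2\kappa^2$ of the noise-induced term, taking the Ornstein-Uhlenbeck value $D^2/2\kappa^2$ for the time-integrated correlation function of $\beta$ as an assumption consistent with (6.6.45), and reports which limiting equation applies. The function names and the numerical values are hypothetical.

```python
def induced_noise_coefficient(eps: float, r: float, a_tilde: float,
                              D: float, kappa: float) -> float:
    """Prefactor eps^(2r-1) * a_tilde^2 * D^2 / (2 kappa^2) of d/dx x d/dx x."""
    return eps ** (2.0 * r - 1.0) * a_tilde ** 2 * D ** 2 / (2.0 * kappa ** 2)


def slave_regime(r: float) -> str:
    """Classify the small-eps limit according to the exponent r in a = eps^r * a_tilde."""
    if r > 0.5:
        return "silent slave: limiting equation (6.6.55)"
    if r == 0.5:
        return "noisy slave: limiting equation (6.6.62)"
    return "slave noise dominant: limiting equation (6.6.63)"


if __name__ == "__main__":
    for r in (1.0, 0.5, 0.25):
        print(r, slave_regime(r), induced_noise_coefficient(1e-4, r, 2.0, 1.0, 2.0))
```

For a fixed small $\varepsilon$ the prefactor is negligible when $r > \tfrac12$, of order one when $r = \tfrac12$, and large when $r < \tfrac12$, which is the content of the three cases listed above.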


6.6.3 Adiabatic Elimination of Fast Variables: A Nonlinear Case 


We want to consider the general case of two variables $x$ and $\alpha$ which are coupled
together in such a way that each affects the other, though the time scale of $\alpha$ is
considerably faster than that of $x$.

Let us consider a system described by a pair of stochastic differential equations:

$$dx = [a(x) + b(x)\alpha]\,dt + c(x)\,dW_1(t) \tag{6.6.65}$$
$$d\alpha = \gamma^2[A(\alpha) - f(x)]\,dt + \gamma\sqrt{2B(\alpha)}\,dW_2(t). \tag{6.6.66}$$

If we naively follow the reasoning of Sect. 6.4, we immediately meet trouble. For in
this limit, one would put

$$A(\alpha) - f(x) = -\frac1\gamma\sqrt{2B(\alpha)}\,\frac{dW_2(t)}{dt} \tag{6.6.67}$$

on the assumption that for large $\gamma$, (6.6.66) is always such that $d\alpha/dt = 0$. But then
to solve (6.6.67) for $\alpha$ in terms of $x$ yields, in general, some complicated nonlinear
function of $x$ and $dW_2(t)/dt$ whose behaviour is inscrutable. However, if $B(\alpha)$ is zero,
then we can define $u_0(x)$ to be

$$A[u_0(x)] = f(x) \tag{6.6.68}$$

and substitute in (6.6.65) to obtain

$$dx = [a(x) + b(x)u_0(x)]\,dt + c(x)\,dW_1(t). \tag{6.6.69}$$

We shall devise a somewhat better procedure based on our previous methodology
which can also take the effect of fluctuations in (6.6.66) into account.
The Fokker-Planck equation equivalent to (6.6.65, 66) is

$$\frac{\partial p}{\partial t} = (\gamma^2L_1 + L_2 + L_3)\,p, \tag{6.6.70}$$

where

$$L_1 = \frac{\partial}{\partial\alpha}[f(x) - A(\alpha)] + \frac{\partial^2}{\partial\alpha^2}B(\alpha) \tag{6.6.71}$$

and $L_2$ and $L_3$ are chosen with hindsight in order for the requirement $PL_2P = 0$
to be satisfied. Firstly, we choose $P$, as usual, to be the projector into the null space
of $L_1$. We write $p_x(\alpha)$ for the stationary solution, i.e., the solution of

$$L_1p_x(\alpha) = 0 \tag{6.6.72}$$

so that $p_x(\alpha)$ explicitly depends on $x$, because $L_1$ explicitly depends on $x$ through the
function $f(x)$. The projector $P$ is defined by

$$(PF)(x, \alpha) = p_x(\alpha)\int d\alpha'\,F(x, \alpha') \tag{6.6.73}$$

for any function $F(x, \alpha)$.
We now define the function $u(x)$ as

$$u(x) = \int d\alpha\,\alpha\,p_x(\alpha) \equiv \langle\alpha\rangle_x. \tag{6.6.74}$$

Then we define

$$L_2 = -\frac{\partial}{\partial x}b(x)[\alpha - u(x)] \tag{6.6.75}$$

$$L_3 = -\frac{\partial}{\partial x}[a(x) + b(x)u(x)] + \frac12\frac{\partial^2}{\partial x^2}[c(x)]^2 \tag{6.6.76}$$

so that the term $\{\partial/\partial x\}b(x)u(x)$ cancels when these are added. Thus (6.6.70) is the
correct FPE corresponding to (6.6.65, 66).
Now we have 


PL,PF = — p,(a) f da’ é. {b(x)[a’ — u(x)]} p.(a’) [ da" F(x, a”) (6.6.77) 
= 0 
since { ap,(a)da = u(x). 


It is of course true that 
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P= Lip] 0 (6.6.78) 
but 
PLS P (6.6.79) 
We now carry out the normal procedure. Writing, as usual, 


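$$ P\tilde p(s) = \tilde v(s) , \qquad (1 - P)\tilde p(s) = \tilde w(s) \tag{6.6.80} $$
and assuming $w(0) = 0$, we find

$$ s\,\tilde v(s) = P(L_2 + L_3)\tilde w(s) + PL_3\tilde v(s) + v(0) $$
$$ s\,\tilde w(s) = [\gamma^2L_1 + (1 - P)L_2 + (1 - P)L_3]\tilde w(s) + L_2\tilde v(s) + (1 - P)L_3\tilde v(s) \tag{6.6.81} $$

so that

$$ s\,\tilde v(s) = PL_3\tilde v(s) + P(L_2 + L_3)[s - \gamma^2L_1 - (1 - P)L_2 - (1 - P)L_3]^{-1}[L_2 + (1 - P)L_3]\tilde v(s) + v(0) . \tag{6.6.82} $$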


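To second order we have simply

$$ s\,\tilde v(s) = \{PL_3 - \gamma^{-2}P(L_2 + L_3)L_1^{-1}[L_2 + (1 - P)L_3]\}\tilde v(s) + v(0) . \tag{6.6.83} $$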


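The term $PL_3\tilde v(s)$ is the most important term and yields the deterministic adiabatic
elimination result. Writing

$$ v(t) = p_x(\alpha)\,\hat p(x) , $$

we find

$$ PL_3v(t) = p_x(\alpha)\int d\alpha'\left\{-\frac{\partial}{\partial x}[a(x) + b(x)u(x)] + \frac{1}{2}\frac{\partial^2}{\partial x^2}[c(x)^2]\right\}p_x(\alpha')\,\hat p(x) \tag{6.6.84} $$

and since $\int d\alpha\,p_x(\alpha) = 1$,

$$ PL_3v(t) = p_x(\alpha)\left\{-\frac{\partial}{\partial x}[a(x) + b(x)u(x)] + \frac{1}{2}\frac{\partial^2}{\partial x^2}[c(x)^2]\right\}\hat p(x) \tag{6.6.85} $$

so the lowest-order differential equation is

$$ \frac{\partial\hat p(x)}{\partial t} = -\frac{\partial}{\partial x}\{[a(x) + b(x)u(x)]\,\hat p(x)\} + \frac{1}{2}\frac{\partial^2}{\partial x^2}[c(x)^2\,\hat p(x)] \tag{6.6.86} $$

which is equivalent to the stochastic differential equation

$$ dx = [a(x) + b(x)u(x)]\,dt + c(x)\,dW(t) . \tag{6.6.87} $$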


To this order, the equation of motion contains no fluctuating term whose origin 
is in the equation for $\alpha$. However, the effect of completely neglecting fluctuations is
given by (6.6.69) which is very similar to (6.6.87) but has $u_0(x)$ instead of $u(x)$. While
it is expected that the average value of $\alpha$ in the stationary state would be similar to
$u_0(x)$, it is not the same, and the similarity would only be close when the noise term
$B(\alpha)$ was small.
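
The difference between $u(x)$ and $u_0(x)$ can be made concrete numerically. The following is a minimal sketch under assumptions that are not in the text: a constant noise strength $B$, the illustrative choice $A(\alpha) = -\alpha^3$, and a fixed value of $f(x)$. For constant $B$ the solution of $L_1 p_x = 0$ is $p_x(\alpha) \propto \exp\{B^{-1}\int d\alpha\,[A(\alpha) - f]\}$, and the helper names `stationary_mean` and `noiseless_value` are hypothetical.

```python
import math

def stationary_mean(f, B, A=lambda a: -a**3, lo=-4.0, hi=4.0, n=8001):
    """Mean of alpha under p_x(alpha) ~ exp{(1/B) * integral of [A(alpha) - f]}.

    This is the stationary solution of L_1 for constant B. The drift
    A(alpha) = -alpha**3 is an illustrative assumption, not from the text.
    """
    h = (hi - lo) / (n - 1)
    grid = [lo + i * h for i in range(n)]
    # Exponent of p_x(alpha): cumulative trapezoidal integral of the drift / B.
    phi = [0.0]
    for i in range(1, n):
        drift_left = A(grid[i - 1]) - f
        drift_right = A(grid[i]) - f
        phi.append(phi[-1] + 0.5 * h * (drift_left + drift_right) / B)
    top = max(phi)  # subtract the maximum so that exp() cannot overflow
    weights = [math.exp(p - top) for p in phi]
    return sum(a * w for a, w in zip(grid, weights)) / sum(weights)

def noiseless_value(f):
    """u_0 solving A(u_0) = f for the illustrative A(alpha) = -alpha**3."""
    return -math.copysign(abs(f) ** (1.0 / 3.0), f)

f_value = 1.0  # stands for f(x) at one fixed x
u0 = noiseless_value(f_value)
for B in (0.01, 0.1, 1.0):
    print(B, u0, stationary_mean(f_value, B))
```

In this sketch the two values agree closely for small $B$ and separate as $B$ grows, which is the behaviour described above.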


Second-Order Corrections 

It is possible to evaluate (6.6.83) to order $\gamma^{-2}$. At first glance, the occurrence of 2nd
derivatives in $L_3$ would seem to indicate that to this order, 4th derivatives occur
since $L_3$ occurs twice. However, we can show that the fourth-order terms vanish.
Consider the expression

$$ P(L_2 + L_3)L_1^{-1}[L_2 + (1 - P)L_3]\tilde v(s) . \tag{6.6.88} $$

We know

i) $P\tilde v(s) = \tilde v(s)$ (6.6.89)

ii) $(1 - P)L_2P\tilde v(s) = L_2P\tilde v(s)$ ,

where we have used $PL_2P = 0$.
Thus, (6.6.88) becomes

$$ P(L_2 + L_3)L_1^{-1}(1 - P)(L_2 + L_3)P\tilde v(s) = P\{PL_2 + [P, L_3] + L_3P\}(1 - P)L_1^{-1}(1 - P)\{L_2P + [L_3, P] + PL_3\}\tilde v(s) \tag{6.6.90} $$

where the commutator $[A, B]$ is defined by

$$ [A, B] = AB - BA . \tag{6.6.91} $$

We have noted that $L_1^{-1}$ commutes with $(1 - P)$ and used $(1 - P)^2 = (1 - P)$ in
(6.6.90) to insert another $(1 - P)$ before $L_1^{-1}$. We have also inserted another $P$
in front of the whole expression, since $P^2 = P$. Using now

$$ P(1 - P) = (1 - P)P = 0 , $$

(6.6.90) becomes

$$ P\{PL_2 + [P, L_3]\}L_1^{-1}(1 - P)\{L_2 + [L_3, P]\}\tilde v(s) . \tag{6.6.92} $$

We will now compute $[P, L_3]$:


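$$ (PL_3f)(x, \alpha) = p_x(\alpha)\left\{-\frac{\partial}{\partial x}[a(x) + b(x)u(x)] + \frac{1}{2}\frac{\partial^2}{\partial x^2}c(x)^2\right\}\int f(x, \alpha')\,d\alpha' \tag{6.6.93} $$

and

$$ (L_3Pf)(x, \alpha) = \left\{-\frac{\partial}{\partial x}[a(x) + b(x)u(x)] + \frac{1}{2}\frac{\partial^2}{\partial x^2}c(x)^2\right\}p_x(\alpha)\int d\alpha'\,f(x, \alpha') . \tag{6.6.94} $$

Subtracting these, and defining

$$ r_x(\alpha) = \frac{\partial p_x(\alpha)}{\partial x}\Big/p_x(\alpha) \tag{6.6.95} $$

$$ s_x(\alpha) = \frac{\partial^2 p_x(\alpha)}{\partial x^2}\Big/p_x(\alpha) , \tag{6.6.96} $$

one finds

$$ ([P, L_3]f)(x, \alpha) = r_x(\alpha)[a(x) + b(x)u(x)]\,Pf(x, \alpha) - \tfrac{1}{2}s_x(\alpha)c(x)^2\,Pf(x, \alpha) - r_x(\alpha)\,P\frac{\partial}{\partial x}[c(x)^2f(x, \alpha)] . \tag{6.6.97} $$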


The last term can be simplified even further since we are only interested in the case
where $f(x, \alpha)$ is $v$, i.e.,

$$ f(x, \alpha) = p_x(\alpha)\,\hat p(x) . \tag{6.6.98} $$

Then,

$$ P\frac{\partial}{\partial x}c(x)^2p_x(\alpha)\hat p(x) \tag{6.6.99} $$

$$ = p_x(\alpha)\frac{\partial}{\partial x}c(x)^2\int d\alpha'\,p_x(\alpha')\hat p(x) \tag{6.6.100} $$

$$ = p_x(\alpha)\frac{\partial}{\partial x}c(x)^2\hat p(x) . \tag{6.6.101} $$

We can further show that

$$ P[P, L_3] = 0 . \tag{6.6.102} $$

For since

$$ \int d\alpha\,p_x(\alpha) = 1 , \tag{6.6.103} $$

it follows that

$$ \int d\alpha\,r_x(\alpha)p_x(\alpha) = \int d\alpha\,s_x(\alpha)p_x(\alpha) = 0 \tag{6.6.104} $$

which is sufficient to demonstrate (6.6.102). Hence, we may write (6.6.92) in the form

$$ PL_2L_1^{-1}\{L_2 + [L_3, P]\}\tilde v(s) \tag{6.6.105} $$

and

$$ [L_3, P]\tilde v(s) = -p_x(\alpha)\{r_x(\alpha)[a(x) + b(x)u(x)] - \tfrac{1}{2}s_x(\alpha)c(x)^2\}\hat p(x) + p_x(\alpha)r_x(\alpha)\frac{\partial}{\partial x}[c(x)^2\hat p(x)] . \tag{6.6.106} $$


6.6.4 An Example with Arbitrary Nonlinear Coupling 
We consider the pair of equations 


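$$ dx = \gamma\,b(x)\,\alpha\,dt $$
$$ d\alpha = -\gamma^2A(x, \alpha, \gamma)\,dt + \gamma\sqrt{2B(x, \alpha, \gamma)}\,dW(t) \tag{6.6.107} $$

and assume the existence of the following limits and asymptotic expansions

$$ A(x, \alpha, \gamma) \sim \sum_{n=0}^{\infty}A_n(x, \alpha)\,\gamma^{-n} $$
$$ B(x, \alpha, \gamma) \sim \sum_{n=0}^{\infty}B_n(x, \alpha)\,\gamma^{-n} . \tag{6.6.108} $$

These expansions imply that there is an asymptotic stationary distribution of $\alpha$
at fixed $x$ given by

$$ p_s(\alpha, x) = \lim_{\gamma\to\infty}p_s(\alpha, x, \gamma) \tag{6.6.109} $$

$$ p_s(\alpha, x) \propto B_0(x, \alpha)^{-1}\exp\left\{-\int d\alpha\,[A_0(x, \alpha)/B_0(x, \alpha)]\right\} . \tag{6.6.110} $$

We assume that $A_0(x, \alpha)$ and $B_0(x, \alpha)$ are such that

$$ \langle\alpha(x)\rangle_s = \int d\alpha\,\alpha\,p_s(\alpha, x) = 0 \tag{6.6.111} $$

so that we deduce from (6.6.108) that, for finite $\gamma$,

$$ \langle\alpha(x, \gamma)\rangle_s = \int d\alpha\,\alpha\,p_s(\alpha, x, \gamma) \sim \alpha_0(x)\gamma^{-1} \tag{6.6.112} $$

where $\alpha_0(x)$ can be determined from (6.6.108).
We define the new variables

$$ \beta = \alpha - \frac{1}{\gamma}\alpha_0(x) , \qquad x_1 = x . \tag{6.6.113} $$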
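In terms of these, the Fokker-Planck operator becomes (the Jacobian is a constant,
as usual), on changing $x_1$ back to $x$,

$$ L = -\frac{\partial}{\partial x}\alpha_0(x)b(x) - \gamma\beta\frac{\partial}{\partial x}b(x) + \frac{1}{\gamma}\alpha_0'(x)\alpha_0(x)b(x)\frac{\partial}{\partial\beta} + \frac{\partial}{\partial\beta}\beta\,\alpha_0'(x)b(x) + \gamma^2\left[\frac{\partial}{\partial\beta}A\left(\beta + \frac{\alpha_0(x)}{\gamma}, x, \gamma\right) + \frac{\partial^2}{\partial\beta^2}B\left(\beta + \frac{\alpha_0(x)}{\gamma}, x, \gamma\right)\right] \tag{6.6.114} $$

and, by using the asymptotic expansions (6.6.108), we can write this as

$$ L = L_3 + \gamma L_2(\gamma) + \gamma^2L_1 \tag{6.6.115} $$

with

$$ L_3 = -\frac{\partial}{\partial x}\alpha_0(x)b(x) \tag{6.6.116} $$

$$ L_1 = \frac{\partial}{\partial\beta}A_0(\beta, x) + \frac{\partial^2}{\partial\beta^2}B_0(\beta, x) \tag{6.6.117} $$

$$ L_2(\gamma) = L_2 + O(\gamma^{-1}) \tag{6.6.118} $$

$$ L_2 = -\beta\frac{\partial}{\partial x}b(x) + \frac{\partial}{\partial\beta}\left[\frac{\partial A_0(\beta, x)}{\partial\beta}\alpha_0(x) + A_1(\beta, x)\right] + \frac{\partial^2}{\partial\beta^2}\left[\frac{\partial B_0(\beta, x)}{\partial\beta}\alpha_0(x) + B_1(\beta, x)\right] . \tag{6.6.119} $$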


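We note that $L_3$ and $L_2$ do not commute, but, as in Sect. 6.5.4, this does not affect
the limiting result,

$$ \frac{\partial\hat p}{\partial t} = (L_3 - PL_2L_1^{-1}L_2)\,\hat p . \tag{6.6.120} $$

The evaluation of the $PL_2L_1^{-1}L_2$ term is straightforward, but messy. We note
that the terms involving $\partial/\partial\beta$ vanish after being operated on by $P$. From the explicit
form of $p_s(\alpha, x)$ one can define $G(\beta, x)$ by

$$ \frac{\partial}{\partial\beta}\left\{\left[\frac{\partial A_0(\beta, x)}{\partial\beta}\alpha_0(x) + A_1(\beta, x)\right]p_s(\beta, x)\right\} + \frac{\partial^2}{\partial\beta^2}\left\{\left[\frac{\partial B_0(\beta, x)}{\partial\beta}\alpha_0(x) + B_1(\beta, x)\right]p_s(\beta, x)\right\} \equiv G(\beta, x)\,p_s(\beta, x) \tag{6.6.121} $$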


and one finds that

$$ PL_2L_1^{-1}L_2\,\hat p = \left[-\frac{\partial}{\partial x}b(x)D(x)\frac{\partial}{\partial x}b(x) + \frac{\partial}{\partial x}b(x)E(x)\right]\hat p \tag{6.6.122} $$

with

$$ D(x) = \int_0^\infty dt\,\langle\beta(t), \beta(0)\,|\,x\rangle \tag{6.6.123} $$

$$ E(x) = \int_0^\infty dt\,\langle\beta(t), G(\beta, x)\,|\,x\rangle $$

where $\langle\ldots|x\rangle$ indicates an average over $p_s(\beta, x)$. This is a rather strong adiabatic
elimination result, in which an arbitrary nonlinear elimination can be handled and
a finite resulting noise dealt with. The calculation is simpler than that in the previous
section, since the terms involving $L_3$ are of lower order here than those
involving $L_2$.


7. Master Equations and Jump Processes 


It is very often the case in systems involving numbers of particles, or individual
objects (animals, bacteria, etc.), that a description in terms of a jump process can be
very plausibly made. In such cases we find, as first mentioned in Sect. 1.1, that in an 
appropriate limit macroscopic deterministic laws of motion arise, about which the 
random nature of the process generates a fluctuating part. However the determinis- 
tic motion and the fluctuations arise directly out of the same description in terms 
of individual jumps, or transitions. In this respect, a description in terms of a jump 
process (and its corresponding master equation) is very satisfactory. 

In contrast, we could model such a system approximately in terms of stochastic 
differential equations, in which the deterministic motion and the fluctuations have a 
completely independent origin. In such a model this independent description of 
fluctuations and deterministic motion is an embarrassment, and fluctuation dissi- 
pation arguments are necessary to obtain some information about the fluctuations. 
In this respect the master equation approach is a much more complete description. 

However the existence of the macroscopic deterministic laws is a very significant 
result, and we will show in this chapter that there is often a limit in which the
solution of a master equation can be approximated asymptotically (in terms of a
large parameter $\Omega$ describing the system size) by a deterministic part (which is the
solution of a deterministic differential equation), plus a fluctuating part, describa- 
ble by a stochastic differential equation, whose coefficients are given by the original 
master equation. Such asymptotic expansions have already been noted in Sect. 
3.8.3, when we dealt with the Poisson process, a very simple jump process, and 
are dealt with in detail in Sect. 7.2. 

The result of these expansions is the development of rather simple rules for 
writing Fokker-Planck equations equivalent (in an asymptotic approximation) to 
master equations, and in fact it is often in practice quite simple to write down the 
appropriate approximate Fokker-Planck equation without ever formulating the 
master equation itself. There are several different ways of formulating the first- 
order approximate Fokker-Planck equation, all of which are equivalent. However, 
there is as yet only one way of systematically expanding in powers of $\Omega^{-1}$, and that
is the system size expansion of van Kampen.

The chapter concludes with an outline of the Poisson representation, a method 
devised by the author and co-workers, which, for a class of master equations, 
sets up a Fokker-Planck equation exactly equivalent to the master equation. In this 
special case, the system size expansion arises as a small noise expansion of the
Poisson representation Fokker-Planck equation. 
7.1 Birth-Death Master Equations: One Variable


The one dimensional prototype of all birth-death systems consists of a population 
of individuals X in which the number that can occur is called x, which is a non- 
negative integer. We are led to consider the conditional probability P(x, t|x’, t') and 
its corresponding master equation. The concept of birth and death is usually that 
only a finite number of X are created (born) or destroyed (die) in a given event. The 
simplest case is when the X are born or die one at a time, with a time independent 
probability so that the transition probabilities W(x| x’, t) can be written 


$$ W(x|x', t) = t^+(x')\,\delta_{x,x'+1} + t^-(x')\,\delta_{x,x'-1} . \tag{7.1.1} $$
Thus there are two processes,

$$ x \to x+1: \quad t^+(x) = \text{transition probability per unit time,} \tag{7.1.2} $$

$$ x \to x-1: \quad t^-(x) = \text{transition probability per unit time.} \tag{7.1.3} $$
The general master equation (3.5.5) then takes the form

$$ \partial_tP(x, t|x', t') = t^+(x-1)P(x-1, t|x', t') + t^-(x+1)P(x+1, t|x', t') - [t^+(x) + t^-(x)]P(x, t|x', t') . \tag{7.1.4} $$

There are no general methods of solving this equation, except in the
time-independent situation.


7.1.1 Stationary Solutions 


We can write the equation for the stationary solution $P_s(x)$ as

$$ 0 = J(x+1) - J(x) \tag{7.1.5} $$
with
$$ J(x) = t^-(x)P_s(x) - t^+(x-1)P_s(x-1) . \tag{7.1.6} $$


We now take note of the fact that x is a non-negative integer; we cannot have a 
negative number of individuals. This requires 


(i) $t^-(0) = 0$: no probability of an individual dying if there are
none present; (7.1.7)
(ii) $P(x, t|x', t') = 0$ for $x < 0$ or $x' < 0$. (7.1.8)


This means that
$$ J(0) = t^-(0)P_s(0) - t^+(-1)P_s(-1) = 0 . \tag{7.1.9} $$

We now sum (7.1.5) so
$$ 0 = \sum_{z=0}^{x-1}[J(z+1) - J(z)] = J(x) - J(0) . \tag{7.1.10} $$
Hence,
$$ J(x) = 0 \tag{7.1.11} $$
and thus
$$ P_s(x) = \frac{t^+(x-1)}{t^-(x)}P_s(x-1) \tag{7.1.12} $$
so that

$$ P_s(x) = P_s(0)\prod_{z=1}^{x}\frac{t^+(z-1)}{t^-(z)} . \tag{7.1.13} $$
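
The product formula (7.1.13) is straightforward to evaluate numerically. The sketch below is one possible way to do so, assuming that $t^+(z-1) > 0$ and $t^-(z) > 0$ on the truncated range $1 \le z \le x_{\max}$; the function name `stationary_distribution` and the cutoff `x_max` are hypothetical, and the product is accumulated in logarithms to avoid overflow.

```python
import math

def stationary_distribution(t_plus, t_minus, x_max):
    """P_s(x) for x = 0..x_max from P_s(x) = P_s(0) * prod_{z=1}^{x} t+(z-1)/t-(z).

    Assumes t_plus(z-1) > 0 and t_minus(z) > 0 for 1 <= z <= x_max.
    The normalisation fixes P_s(0) on the truncated range.
    """
    log_p = [0.0]  # log of the unnormalised P_s(0)
    for z in range(1, x_max + 1):
        log_p.append(log_p[-1] + math.log(t_plus(z - 1)) - math.log(t_minus(z)))
    top = max(log_p)
    weights = [math.exp(v - top) for v in log_p]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative rates: constant birth rate 4, death rate proportional to x.
p = stationary_distribution(lambda x: 4.0, lambda x: 1.0 * x, 60)
print(p[:6])
```

With a constant birth rate and a death rate proportional to $x$, as in this usage example, the result should reproduce a Poisson distribution up to the truncation.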


a) Detailed Balance Interpretation 

The condition J(x) = 0 can be viewed as a detailed balance requirement, in which x 
is an even variable. For, it is clear that it is a form of the detailed balance condition 
(5.3.74), which takes the form here of 


$$ P(x, \tau|x', 0)P_s(x') = P(x', \tau|x, 0)P_s(x) . \tag{7.1.14} $$


Setting $x' = x + 1$ and taking the limit $\tau \to 0$, and noting that by definition
(3.4.1),

$$ W(x|x', t) = \lim_{\tau\to0}P(x, t+\tau|x', t)/\tau , \tag{7.1.15} $$


the necessity of this condition is easily proved. 


b) Rate Equations 
We notice that the mean of x satisfies 


$$ \partial_t\langle x(t)\rangle = \partial_t\sum_{x=0}^{\infty}xP(x, t|x', t') \tag{7.1.16} $$

$$ = \sum_{x=0}^{\infty}x[t^+(x-1)P(x-1, t|x', t') - t^+(x)P(x, t|x', t')] + \sum_{x=0}^{\infty}x[t^-(x+1)P(x+1, t|x', t') - t^-(x)P(x, t|x', t')] \tag{7.1.17} $$

$$ = \sum_{x=0}^{\infty}[(x+1)t^+(x) - xt^+(x) + (x-1)t^-(x) - xt^-(x)]P(x, t|x', t') , \tag{7.1.18} $$

i.e.,

$$ \frac{d}{dt}\langle x(t)\rangle = \langle t^+[x(t)]\rangle - \langle t^-[x(t)]\rangle . \tag{7.1.19} $$

The corresponding deterministic equation is that which would be obtained by
neglecting fluctuations, i.e.,

$$ \frac{dx}{dt} = t^+(x) - t^-(x) . \tag{7.1.20} $$

Notice that a stationary state occurs deterministically when

$$ t^+(x) = t^-(x) . \tag{7.1.21} $$
Corresponding to this, notice that the maximum value of $P_s(x)$ occurs when

$$ P_s(x)/P_s(x-1) \simeq 1 , \tag{7.1.22} $$
which from (7.1.12) corresponds to

$$ t^+(x-1) = t^-(x) . \tag{7.1.23} $$

Since the variable $x$ takes on only integral values, for sufficiently large $x$ (7.1.21) and
(7.1.23) are essentially the same.

Thus, the modal value of $x$, which corresponds to (7.1.23), is the stationary
stochastic analogue of the deterministic steady state which corresponds to (7.1.21).


7.1.2 Example: Chemical Reaction X ⇌ A


We treat the case of a reaction $X \rightleftharpoons A$, with rate constant $k_1$ for $X \to A$ and $k_2$ for
$A \to X$, in which it is assumed that A is a fixed
concentration. Thus, we assume
$$ t^+(x) = k_2a \tag{7.1.24} $$
$$ t^-(x) = k_1x \tag{7.1.25} $$

so that the Master equation takes the simple form [in which we abbreviate
$P(x, t|x', t')$ to $P(x, t)$]

$$ \partial_tP(x, t) = k_2aP(x-1, t) + k_1(x+1)P(x+1, t) - (k_1x + k_2a)P(x, t) . \tag{7.1.26} $$


a) Generating Function 
To solve the equation, we introduce the generating function (c.f. Sects. 1.4.1, 3.8.2)

$$ G(s, t) = \sum_{x=0}^{\infty}s^xP(x, t) \tag{7.1.27} $$

so that

$$ \partial_tG(s, t) = k_2a(s-1)G(s, t) - k_1(s-1)\partial_sG(s, t) . \tag{7.1.28} $$

If we substitute

$$ \phi(s, t) = G(s, t)\exp(-k_2as/k_1) , \tag{7.1.29} $$
(7.1.28) becomes

$$ \partial_t\phi(s, t) = -k_1(s-1)\partial_s\phi(s, t) . \tag{7.1.30} $$
The further substitution $s - 1 = e^z$,

$$ \phi(s, t) = \psi(z, t) $$
gives

$$ \partial_t\psi(z, t) + k_1\partial_z\psi(z, t) = 0 \tag{7.1.31} $$
whose solution is an arbitrary function of $k_1t - z$. For convenience, write this as

$$ \psi(z, t) = F[\exp(-k_1t + z)]\,e^{-k_2a/k_1} = F[(s-1)e^{-k_1t}]\,e^{-k_2a/k_1} $$

so

$$ G(s, t) = F[(s-1)e^{-k_1t}]\exp[(s-1)k_2a/k_1] . \tag{7.1.32} $$

Normalisation requires $G(1, t) = 1$, and hence

$$ F(0) = 1 . \tag{7.1.33} $$


b) Conditional Probability 
The initial condition determines $F$; this is (setting $t' = 0$)

$$ P(x, 0|N, 0) = \delta_{x,N} \tag{7.1.34} $$

which means

$$ G(s, 0) = s^N = F(s-1)\exp[(s-1)k_2a/k_1] \tag{7.1.35} $$
so that
$$ G(s, t) = \exp\left[\frac{k_2a}{k_1}(s-1)(1 - e^{-k_1t})\right][1 + (s-1)e^{-k_1t}]^N . \tag{7.1.36} $$

This can now be expanded in a power series in $s$ giving

$$ P(x, t|N, 0) = \exp\left[-\frac{k_2a}{k_1}(1 - e^{-k_1t})\right]\sum_{r=0}^{\min(x,N)}\frac{N!}{(N-r)!\,r!\,(x-r)!}\left(\frac{k_2a}{k_1}\right)^{x-r}e^{-k_1tr}(1 - e^{-k_1t})^{N+x-2r} . \tag{7.1.37} $$

This very complicated answer is a complete solution to the problem but is of very
little practical use. It is better to work either directly from (7.1.36), the generating
function, or from the equations for mean values.

From the generating function we can compute

$$ \langle x(t)\rangle = \partial_sG(s=1, t) = \frac{k_2a}{k_1}(1 - e^{-k_1t}) + Ne^{-k_1t} \tag{7.1.38} $$
$$ \langle x(t)[x(t) - 1]\rangle = \partial_s^2G(s=1, t) = \langle x(t)\rangle^2 - Ne^{-2k_1t} \tag{7.1.39} $$
$$ \mathrm{var}\{x(t)\} = \left(Ne^{-k_1t} + \frac{k_2a}{k_1}\right)(1 - e^{-k_1t}) . \tag{7.1.40} $$
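
The structure of (7.1.36) is the product of a Poisson generating function and a binomial one, so the conditional probability is the convolution of the two distributions. The sketch below evaluates it in that form and can be used to check the mean (7.1.38) and variance (7.1.40) numerically. The function name `conditional_probability` and the parameter values are illustrative assumptions, and `k2a` stands for the product $k_2a$.

```python
import math

def conditional_probability(x, t, N, k1, k2a):
    """P(x, t | N, 0) for X <-> A as a binomial-Poisson convolution.

    q = exp(-k1 t) is the survival probability of each initial molecule;
    lam = (k2a/k1)(1 - q) is the mean number of newly created molecules.
    """
    q = math.exp(-k1 * t)
    lam = (k2a / k1) * (1.0 - q)
    total = 0.0
    for r in range(0, min(x, N) + 1):
        survivors = math.comb(N, r) * q**r * (1.0 - q) ** (N - r)
        created = math.exp(-lam) * lam ** (x - r) / math.factorial(x - r)
        total += survivors * created
    return total

# Illustrative parameters (not from the text).
N, k1, k2a, t = 10, 1.0, 4.0, 0.7
probs = [conditional_probability(x, t, N, k1, k2a) for x in range(80)]
mean = sum(x * p for x, p in enumerate(probs))
variance = sum(x * x * p for x, p in enumerate(probs)) - mean**2
print(mean, variance)
```

Working from the moments, as the text recommends, avoids the sum entirely; the sketch is only meant as a cross-check of the closed forms.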


c) Moment Equations 
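From the differential equation (7.1.28) we have

$$ \partial_t[\partial_s^nG(s, t)] = \{n[k_2a\,\partial_s^{n-1} - k_1\partial_s^n] + (s-1)[k_2a\,\partial_s^n - k_1\partial_s^{n+1}]\}G(s, t) . \tag{7.1.41} $$
Setting $s = 1$ and using

$$ \partial_s^nG(s, t)|_{s=1} = \langle x(t)^n\rangle_f , \tag{7.1.42} $$
we find

$$ \frac{d}{dt}\langle x(t)^n\rangle_f = n[k_2a\langle x(t)^{n-1}\rangle_f - k_1\langle x(t)^n\rangle_f] \tag{7.1.43} $$

and these equations form a closed hierarchy. Naturally, the mean and variance
solutions correspond to (7.1.38, 40).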


d) Autocorrelation Function and Stationary Distribution 
As $t \to \infty$ for any $F$, we find from (7.1.32, 33)

$$ G(s, t\to\infty) = \exp[(s-1)k_2a/k_1] \tag{7.1.44} $$
corresponding to the Poissonian solution:

$$ P_s(x) = \exp(-k_2a/k_1)\,(k_2a/k_1)^x/x! . \tag{7.1.45} $$

Since the equation of time evolution for $\langle x(t)\rangle$ is linear, we can apply the methods
of Sect. 3.7.4, namely, the regression theorem, which states that the stationary
autocorrelation function has the same time dependence as the mean, and its value
at $t = 0$ is the stationary variance. Hence,

$$ \langle x(t)\rangle_s = k_2a/k_1 \tag{7.1.46} $$
$$ \mathrm{var}\{x(t)\}_s = k_2a/k_1 \tag{7.1.47} $$
$$ \langle x(t), x(0)\rangle_s = e^{-k_1t}\,k_2a/k_1 . \tag{7.1.48} $$


The Poissonian stationary solution also follows from (7.1.13) by direct substitution. 
e) Poissonian Time-Dependent Solutions
A very interesting property of this equation is the existence of Poissonian
time-dependent solutions. For if we choose

$$ P(x, 0) = \frac{e^{-\alpha_0}\alpha_0^x}{x!} , \tag{7.1.49} $$
then
$$ G(s, 0) = \exp[(s-1)\alpha_0] \tag{7.1.50} $$

and from (7.1.32) we find
$$ G(s, t) = \exp\{(s-1)[\alpha_0e^{-k_1t} + (k_2a/k_1)(1 - e^{-k_1t})]\} \tag{7.1.51} $$
corresponding to

$$ P(x, t) = \frac{e^{-\alpha(t)}\alpha(t)^x}{x!} \tag{7.1.52} $$
with

$$ \alpha(t) = \alpha_0e^{-k_1t} + (k_2a/k_1)(1 - e^{-k_1t}) . \tag{7.1.53} $$
Here $\alpha(t)$ is seen to be the solution of the deterministic equation

$$ \frac{dx}{dt} = k_2a - k_1x \tag{7.1.54} $$
with the initial condition $x(0) = \alpha_0$. (7.1.55)


This result can be generalised to many variables and forms the rationale for the 
Poisson representation which will be developed in Sect. 7.7. The existence of Pois- 
sonian propagating solutions is a consequence of the linearity of the system. 
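
The Poissonian propagating solution can be checked by integrating the master equation (7.1.26) directly. The following is a rough sketch, not a production solver: the state space is truncated at a hypothetical cutoff `n_states`, the time stepping is a plain fourth-order Runge-Kutta scheme, and all parameter values are illustrative assumptions (`k2a` again denotes the product $k_2a$).

```python
import math

def master_rhs(P, k1, k2a):
    """Right-hand side of (7.1.26) on the truncated range x = 0..len(P)-1."""
    n = len(P)
    dP = []
    for x in range(n):
        gain_birth = k2a * P[x - 1] if x > 0 else 0.0
        gain_death = k1 * (x + 1) * P[x + 1] if x + 1 < n else 0.0
        dP.append(gain_birth + gain_death - (k1 * x + k2a) * P[x])
    return dP

def rk4_step(P, h, k1, k2a):
    """One classical Runge-Kutta step of size h."""
    a = master_rhs(P, k1, k2a)
    b = master_rhs([p + 0.5 * h * d for p, d in zip(P, a)], k1, k2a)
    c = master_rhs([p + 0.5 * h * d for p, d in zip(P, b)], k1, k2a)
    d = master_rhs([p + h * e for p, e in zip(P, c)], k1, k2a)
    return [p + h * (s1 + 2 * s2 + 2 * s3 + s4) / 6.0
            for p, s1, s2, s3, s4 in zip(P, a, b, c, d)]

def poisson(mean, n):
    """Poisson probabilities for x = 0..n-1."""
    return [math.exp(-mean) * mean**x / math.factorial(x) for x in range(n)]

# Illustrative parameters (assumptions, not from the text).
k1, k2a, alpha0, t_end, n_states, steps = 1.0, 5.0, 2.0, 1.0, 60, 1000
P = poisson(alpha0, n_states)          # initial condition (7.1.49)
for _ in range(steps):
    P = rk4_step(P, t_end / steps, k1, k2a)

# Mean predicted by (7.1.53) and the deviation from the Poisson form (7.1.52).
alpha_t = alpha0 * math.exp(-k1 * t_end) + (k2a / k1) * (1.0 - math.exp(-k1 * t_end))
max_error = max(abs(p - q) for p, q in zip(P, poisson(alpha_t, n_states)))
print(alpha_t, max_error)
```

The cutoff must be large compared with the mean so that probability leaking past the last state is negligible; for a nonlinear system the same integration would show the distribution departing from Poissonian form.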


7.1.3 A Chemical Bistable System 


We consider the system 


$$ A + 2X \underset{k_2}{\overset{k_1}{\rightleftharpoons}} 3X \tag{7.1.56} $$

$$ A \underset{k_4}{\overset{k_3}{\rightleftharpoons}} X \tag{7.1.57} $$

which has been studied by many authors [7.1]. The concentration of A is held fixed
so that we have

$$ t^+(x) = k_1Ax(x-1) + k_3A $$
$$ t^-(x) = k_2x(x-1)(x-2) + k_4x . \tag{7.1.58} $$
The corresponding deterministic equation is, of course,

$$ \frac{dx}{dt} = t^+(x) - t^-(x) = -k_2x^3 + k_1Ax^2 - k_4x + k_3A , \tag{7.1.59} $$

where it is assumed that $x \gg 1$ so that we set $x(x-1)(x-2) \to x^3$, etc. The
solution of this equation, with the initial condition $x(0) = x_0$, is given by

$$ \left(\frac{x - x_1}{x_0 - x_1}\right)^{x_3 - x_2}\left(\frac{x - x_2}{x_0 - x_2}\right)^{x_1 - x_3}\left(\frac{x - x_3}{x_0 - x_3}\right)^{x_2 - x_1} = \exp[-k_2(x_1 - x_2)(x_2 - x_3)(x_3 - x_1)t] . \tag{7.1.60} $$

Here, $x_1$, $x_2$, $x_3$ are roots of

$$ k_2x^3 - k_1Ax^2 + k_4x - k_3A = 0 \tag{7.1.61} $$
with $x_3 \ge x_2 \ge x_1$.


Clearly these roots are the stationary values of the solutions $x(t)$ of (7.1.59).
From (7.1.59) we see that

$$ x < x_1 \;\Rightarrow\; \frac{dx}{dt} > 0 $$
$$ x_1 < x < x_2 \;\Rightarrow\; \frac{dx}{dt} < 0 \tag{7.1.62} $$
$$ x_2 < x < x_3 \;\Rightarrow\; \frac{dx}{dt} > 0 $$
$$ x > x_3 \;\Rightarrow\; \frac{dx}{dt} < 0 . $$

Thus, in the region $x < x_2$, $x(t)$ will be attracted to $x_1$ and in the region $x > x_2$,
$x(t)$ will be attracted to $x_3$. The solution $x(t) = x_2$ will be unstable to small
perturbations. This yields a system with two deterministically stable stationary states.


a) Stochastic Stationary Solution 
From (7.1.13) 


$$ P_s(x) = P_s(0)\prod_{z=1}^{x}\left\{\frac{B[(z-1)(z-2) + P]}{z[(z-1)(z-2) + R]}\right\} , \tag{7.1.63} $$

where

$$ B = k_1A/k_2 , \qquad R = k_4/k_2 , \qquad P = k_3/k_1 . \tag{7.1.64} $$

Notice that if $P = R$, the solution (7.1.63) is Poissonian with mean $B$. In this case,
we have a stationary state in which reactions (7.1.56, 57) are simultaneously in
balance. This is chemical equilibrium, in which, as we will show later, there is always
a Poissonian solution (Sects. 7.5.1 and 7.7b). The maxima of (7.1.63) occur,
according to (7.1.21), when

$$ B = x[(x-1)(x-2) + R]/[P + x(x-1)] . \tag{7.1.65} $$

The function $x = x(B)$, found by inverting (7.1.65), gives the maxima (or minima)
corresponding to that value of $B$ for a given $P$ and $R$.
There are the two asymptotic forms:

$$ x(B) \sim B \quad\text{(large } B\text{)} $$
$$ x(B) \sim PB/R \quad\text{(small } B\text{)} \tag{7.1.66} $$

If $R > 9P$, we can show that the slope of $x(B)$ becomes negative for some range
of $x > 0$ and thus we get three solutions for a given $B$, as shown in Fig. 7.1. The
transition from one straight line to the other gives the kink that can be seen.

Notice also that for the choice of parameters shown, the bimodal shape is signi- 
ficant over a very small range of B. This range is very much narrower than the 
range over which P(x) is two peaked, since the ratio of the heights of the peaks 
can be very high. 
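
The appearance of one or two peaks can be explored directly from the ratio of successive terms in (7.1.63), since $P_s(x)$ has a local maximum wherever that ratio passes from above 1 to below 1. The sketch below does this for illustrative values $P = 100$, $R = 2500$ (so that $R > 9P$); these values, the cutoff `x_max` and the function names are assumptions chosen for a quick calculation and are not the parameters of Fig. 7.1.

```python
import math

def log_ratio(z, B, P, R):
    """log[P_s(z) / P_s(z-1)] taken from the factors of (7.1.63)."""
    q = (z - 1) * (z - 2)
    return math.log(B * (q + P)) - math.log(z * (q + R))

def modes(B, P, R, x_max):
    """Positions of the local maxima of P_s(x) for 0 <= x <= x_max."""
    # lr[z - 1] holds the log ratio for z = 1 .. x_max + 1
    lr = [log_ratio(z, B, P, R) for z in range(1, x_max + 2)]
    peaks = []
    for x in range(0, x_max + 1):
        rises = (x == 0) or lr[x - 1] > 0.0   # P_s(x) > P_s(x-1)
        falls = lr[x] < 0.0                   # P_s(x+1) < P_s(x)
        if rises and falls:
            peaks.append(x)
    return peaks

# Illustrative parameters with R > 9P (assumptions, not those of Fig. 7.1).
P_par, R_par = 100.0, 2500.0
for B in (60.0, 110.0, 200.0):
    print(B, modes(B, P_par, R_par, 400))
```

Scanning $B$ in this way traces out the stable branches of $x(B)$; the middle root of (7.1.65) corresponds to the minimum between the two peaks and is therefore not reported.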


Fig. 7.1. Plot of x(B) against B, as given by the solution of (7.1.65) for various values of R/P, and P = 10,000
A more precise result can be given. Suppose the volume $V$ of the system becomes
very large and the concentration $y$ of X given by

$$ y = x/V , $$

is constant. Clearly the transition probabilities must scale like $V$, since the rate of
production of X will scale like $x = yV$.
Hence,
$$ k_1A \sim 1/V , \qquad k_3A \sim V , \qquad k_2 \sim 1/V^2 , \qquad k_4 \sim 1 \tag{7.1.67} $$

which means that

$$ B \sim V , \qquad R \sim V^2 , \qquad P \sim V^2 . \tag{7.1.68} $$

We then write

$$ B = \tilde BV , \qquad R = \tilde RV^2 , \qquad P = \tilde PV^2 $$

so that (7.1.65) becomes
$$ \tilde B = y(y^2 + \tilde R)/(y^2 + \tilde P) . $$
And if $y_1$ and $y_2$ are two values of $y$,
$$ \log[P_s(y_2)/P_s(y_1)] = \sum_{z=y_1V}^{y_2V}\{\log\tilde BV + \log(z^2 + \tilde PV^2) - \log[z(z^2 + \tilde RV^2)]\} \tag{7.1.69} $$
and we now approximate by an integral

$$ \sim V\int_{y_1}^{y_2}dy\,\log\left[\frac{\tilde B(y^2 + \tilde P)}{y(y^2 + \tilde R)}\right] . $$

Hence,
$$ \frac{P_s(y_2)}{P_s(y_1)} \simeq \exp\left\{V\int_{y_1}^{y_2}dy\,\log\left[\frac{\tilde B(y^2 + \tilde P)}{y(y^2 + \tilde R)}\right]\right\} , \tag{7.1.70} $$
and as $V \to \infty$, depending on the sign of the integral, this ratio becomes either
zero or infinity. Thus, in a large volume limit, the two peaks, unless precisely equal
in height, become increasingly unequal and only one survives.

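The variance of the distribution can be obtained by a simple trick. Notice from
(7.1.63) that $P_s(x)$ can be written

$$ P_s(x) = B^xG(x) , \tag{7.1.71} $$

where $G(x)$ is a function defined through (7.1.63). Then,

$$ \langle x^k\rangle = \left[\sum_{x=0}^{\infty}x^kB^xG(x)\right]\left[\sum_{x=0}^{\infty}B^xG(x)\right]^{-1} $$
and
$$ B\frac{\partial}{\partial B}\langle x^k\rangle = \langle x^{k+1}\rangle - \langle x\rangle\langle x^k\rangle \tag{7.1.72} $$
so that

$$ \mathrm{var}\{x\} = B\frac{\partial}{\partial B}\langle x\rangle . \tag{7.1.73} $$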


From this we note that as $V \to \infty$,

$$ \mathrm{var}\{y\} \sim \frac{1}{V} \to 0 . \tag{7.1.74} $$

So a deterministic limit is approached. Further, notice that if $\langle x\rangle$ is proportional
to $B$, the variance is equal to the mean; in fact, we find on the two branches (7.1.66),

$$ \langle x(B)\rangle = \mathrm{var}\{x(B)\} = B \quad\text{(large } B\text{)} $$
$$ \langle x(B)\rangle = \mathrm{var}\{x(B)\} = PB/R \quad\text{(small } B\text{)} $$

which means that the distributions are roughly Poissonian on these limiting
branches.

The stochastic mean is not, in fact, given exactly by the peak values but
approximates them very well. Of course, for any $B$ there is one well defined $\langle x(B)\rangle$, not
three values. Numerical computations show that the mean closely follows the
lower branch and then suddenly makes a transition at $B_c$ to the upper branch.
This will be the value at which the two maxima have equal height and can, in
principle, be determined from (7.1.70).


b) Time-Dependent Behaviour 

This is impossible to deduce exactly. Almost all approximation methods depend 
on the large volume limit, whose properties in a stationary situation have just been 
noted and which will be dealt with systematically in the next section. 
7.2 Approximation of Master Equations by Fokker-Planck
Equations


The existence of the parameter $V$ in terms of which well-defined scaling laws are
valid leads to the concept of system size expansions, first put on a systematic basis
by van Kampen [7.2]. There is a confused history attached to this, which arises out of
repeated attempts to find a limiting form of the Master equation in which a
Fokker-Planck equation arises. However, the fundamental result is that a diffusion process
can always be approximated by a jump process, not the reverse.


7.2.1. Jump Process Approximation of a Diffusion Process 


The prototype result is that found for the random walk in Sect. 3.8.2, that in the
limit of infinitely small jump size, the Master equation becomes a Fokker-Planck
equation. Clearly the jumps must become more probable and smaller, and this can
be summarised by a scaling assumption: that there is a parameter $\delta$, such that the
average step size and the variance of the step size are proportional to $\delta$, and such
that the jump probabilities increase as $\delta$ becomes small.

We assume that the jump probabilities can be written

$$ W_\delta(x'|x) = \Phi\left(\frac{x' - x - A(x)\delta}{\sqrt{\delta}}, x\right)\delta^{-3/2} \tag{7.2.1} $$
where
$$ \int dy\,\Phi(y, x) = Q \tag{7.2.2} $$
and
$$ \int dy\,y\,\Phi(y, x) = 0 . \tag{7.2.3} $$


This means that

$$ \alpha_0(x) \equiv \int dx'\,W_\delta(x'|x) = Q/\delta $$
$$ \alpha_1(x) \equiv \int dx'\,(x' - x)W_\delta(x'|x) = A(x)Q \tag{7.2.4} $$
$$ \alpha_2(x) \equiv \int dx'\,(x' - x)^2W_\delta(x'|x) = \int dy\,y^2\Phi(y, x) . $$


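We further assume that $\Phi(y, x)$ vanishes sufficiently rapidly as $y \to \infty$, so that

$$ \lim_{\delta\to0}W_\delta(x'|x) = \lim_{y\to\infty}\left[\left(\frac{y}{x' - x}\right)^3\Phi(y, x)\right] = 0 \quad\text{for } x' \neq x . \tag{7.2.5} $$

The conditions (7.2.4, 5) are very similar to those in Sect. 3.4, namely, (3.4.1, 4, 5)
and by taking a twice differentiable function $f(z)$, one can carry out much the same
procedure as that used in Sect. 3.4 to show that

$$ \lim_{\delta\to0}\left\langle\frac{\partial f(z)}{\partial t}\right\rangle = \left\langle\alpha_1(z)\frac{\partial f}{\partial z} + \frac{1}{2}\alpha_2(z)\frac{\partial^2f}{\partial z^2}\right\rangle \tag{7.2.6} $$

implying that in the limit $\delta \to 0$, the Master equation

$$ \frac{\partial P(x)}{\partial t} = \int dx'\,[W(x|x')P(x') - W(x'|x)P(x)] \tag{7.2.7} $$

becomes the FPE

$$ \frac{\partial P(x)}{\partial t} = -\frac{\partial}{\partial x}\alpha_1(x)P(x) + \frac{1}{2}\frac{\partial^2}{\partial x^2}\alpha_2(x)P(x) . \tag{7.2.8} $$

Thus, given (7.2.8), one can always construct a Master equation depending on a
parameter $\delta$ which approximates it as closely as desired. Such a Master equation
will have transition probabilities which satisfy the criteria of (7.2.4). If they do not
satisfy these criteria, then this approximation is not possible. Some examples are
appropriate.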


a) Random Walk (Sect.3.8.2) 
Let $x = nl$, then

$$ W(x|x') = d\,(\delta_{x',x-l} + \delta_{x',x+l}) . \tag{7.2.9} $$

Then

$$ \alpha_0(x) = 2d , \qquad \alpha_1(x) = 0 , \qquad \alpha_2(x) = 2l^2d ; \tag{7.2.10} $$

let

$$ \delta = l^2 \tag{7.2.11} $$
and
$$ D = l^2d . \tag{7.2.12} $$

Then all requirements are met, so the limiting equation is

$$ \frac{\partial P}{\partial t} = D\frac{\partial^2P}{\partial x^2} \tag{7.2.13} $$

as found in Sect. 3.8.2.


b) Poisson Process (Sect. 3.8.3)
Here, letting $x = nl$,

$$ W(x|x') = d\,\delta_{x,x'+l} \tag{7.2.14} $$
and
$$ \alpha_0(x) = d , \qquad \alpha_1(x) = ld , \qquad \alpha_2(x) = l^2d . \tag{7.2.15} $$

There is no way of parametrising $l$ and $d$ in terms of $\delta$ such that $l \to 0$ and both
$\alpha_1(x)$ and $\alpha_2(x)$ are finite. In this case, there is no Fokker-Planck limit.


c) General Approximation of Diffusion Process by a Birth-Death Master Equation 
Suppose we have a Master equation such that 


$$ W_\delta(x'|x) = \left(\frac{A(x)}{2\delta} + \frac{B(x)}{2\delta^2}\right)\delta_{x',x+\delta} + \left(-\frac{A(x)}{2\delta} + \frac{B(x)}{2\delta^2}\right)\delta_{x',x-\delta} \tag{7.2.16} $$

so that for sufficiently small $\delta$, $W_\delta(x'|x)$ is positive and we assume that this is
uniformly possible over the range of $x$ of interest. The process then takes place on a
range of $x$ composed of integral multiples of $\delta$. This is not of the form of (7.2.1)
but, nevertheless, in the limit $\delta \to 0$ gives a FPE. For

$$ \alpha_0(x) = B(x)/\delta^2 \tag{7.2.17a} $$

$$ \alpha_1(x) = A(x) \tag{7.2.17b} $$

$$ \alpha_2(x) = B(x) \tag{7.2.17c} $$
and

$$ \lim_{\delta\to0}W_\delta(x'|x) = 0 \quad\text{for } x' \neq x . \tag{7.2.17d} $$

Here, however, $\alpha_0(x)$ diverges like $1/\delta^2$, rather than like $1/\delta$ as in (7.2.4) and the
picture of a jump taking place according to a smooth distribution is no longer valid.
The proof carries through, however, since the behaviour of $\alpha_0(x)$ is irrelevant and
the limiting FPE is

$$ \frac{\partial P(x)}{\partial t} = -\frac{\partial}{\partial x}A(x)P(x) + \frac{1}{2}\frac{\partial^2}{\partial x^2}B(x)P(x) . \tag{7.2.18} $$

In this form, we see that we have a possible tool for simulating a diffusion process
by an approximating birth-death process. The method fails if $B(x) = 0$ anywhere
in the range of $x$, since this leads to negative $W_\delta(x'|x)$. Notice that the stationary
solution of the Master equation in this case is

$$ P_s(x) = P_s(0)\prod_{z=\delta}^{x}\left[\frac{\delta A(z-\delta) + B(z-\delta)}{-\delta A(z) + B(z)}\right] $$

$$ = P_s(0)\left[\frac{\delta A(0) + B(0)}{-\delta A(x) + B(x)}\right]\prod_{z=\delta}^{x-\delta}\left[\frac{1 + \delta A(z)/B(z)}{1 - \delta A(z)/B(z)}\right] \tag{7.2.19} $$

so that, for small enough $\delta$,
$$ \log P_s(x) \to \text{const} - \log B(x) + \sum_{z=0}^{x}2\delta A(z)/B(z) , \tag{7.2.20} $$
i.e.,

$$ P_s(x) \to \frac{\mathcal{N}}{B(x)}\exp\left[2\int_0^xdz\,A(z)/B(z)\right] \tag{7.2.21} $$


as required. The limit is clearly uniform in any finite interval of x provided A(x)/B(x) 
is bounded there. 
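
As an illustration of using (7.2.16) as a simulation tool, the sketch below builds the stationary solution of the approximating birth-death process on a lattice of spacing $\delta$ and compares it with the continuum form (7.2.21). The choice $A(x) = -x$, $B(x) = 1$ (for which (7.2.21) gives a density proportional to $e^{-x^2}$), the lattice range and the function name are assumptions made for the example.

```python
import math

def birth_death_stationary(A, B, delta, x_lo, x_hi):
    """Stationary density of the birth-death process (7.2.16) on a lattice.

    Uses P_s(x)/P_s(x - delta) = t+(x - delta)/t-(x), with
    t+(x) = A/(2 delta) + B/(2 delta^2) and t-(x) = -A/(2 delta) + B/(2 delta^2).
    Requires delta small enough that both rates are positive on the range.
    """
    n_lo = int(round(x_lo / delta))
    n_hi = int(round(x_hi / delta))
    xs = [k * delta for k in range(n_lo, n_hi + 1)]

    def t_plus(x):
        return A(x) / (2.0 * delta) + B(x) / (2.0 * delta**2)

    def t_minus(x):
        return -A(x) / (2.0 * delta) + B(x) / (2.0 * delta**2)

    log_p = [0.0]
    for i in range(1, len(xs)):
        log_p.append(log_p[-1] + math.log(t_plus(xs[i - 1])) - math.log(t_minus(xs[i])))
    top = max(log_p)
    weights = [math.exp(v - top) for v in log_p]
    norm = sum(weights) * delta          # normalise as a density in x
    return xs, [w / norm for w in weights]

# Illustrative drift and diffusion: A(x) = -x, B(x) = 1.
xs, density = birth_death_stationary(lambda x: -x, lambda x: 1.0, 0.05, -3.0, 3.0)
worst = max(abs(d - math.exp(-x * x) / math.sqrt(math.pi)) for x, d in zip(xs, density))
print(worst)
```

Reducing `delta` should shrink the discrepancy, in line with the uniform convergence noted above, at the cost of jump rates that grow like $1/\delta^2$.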


7.2.2 The Kramers-Moyal Expansion 


A simple but nonrigorous derivation was given by Kramers [7.3] and considerably
improved by Moyal [7.4]. It was implicitly used by Einstein [7.5] as explained in
Sect. 1.2.1.

In the Master equation (7.2.7), we substitute for $x'$ by defining

$y = x - x'$ in the first term, and
$y = x' - x$ in the second term.
Defining
$$ t(y, x) = W(x + y|x) , \tag{7.2.22} $$

the master equation becomes

$$ \frac{\partial P(x)}{\partial t} = \int dy\,[t(y, x-y)P(x-y) - t(y, x)P(x)] . \tag{7.2.23} $$

We now expand in power series,

$$ \frac{\partial P(x)}{\partial t} = \int dy\sum_{n=1}^{\infty}\frac{(-y)^n}{n!}\frac{\partial^n}{\partial x^n}[t(y, x)P(x)] \tag{7.2.24} $$

$$ = \sum_{n=1}^{\infty}\frac{(-1)^n}{n!}\frac{\partial^n}{\partial x^n}[\alpha_n(x)P(x)] \tag{7.2.25} $$
where

$$ \alpha_n(x) = \int dx'\,(x' - x)^nW(x'|x) = \int dy\,y^n\,t(y, x) . \tag{7.2.26} $$

By terminating the series (7.2.25) at the second term, we obtain the Fokker-Planck
equation (7.2.8).

In introducing the system size expansion, van Kampen criticised this “proof”, 
because there is no consideration of what small parameter is being considered. 




Nevertheless, this procedure enjoyed wide popularity—mainly because of the 
convenience and simplicity of the result. However, the demonstration in Sect.7.2.1 
shows that there are limits to its validity. Indeed, if we assume that $W(x'|x)$ has
the form (7.2.1), we find that

$$ \alpha_n(x) = \delta^{n/2-1}\int dy\,y^n\Phi(y, x) , \tag{7.2.27} $$

so that as $\delta \to 0$, terms higher than the second in the expansion (7.2.25) (the
Kramers-Moyal expansion) do vanish. And indeed in his presentation, Moyal 
[7.4] did require conditions equivalent to (7.2.4, 5). 


7.2.3. Van Kampen’s System Size Expansion [7.2] 


Birth-death master equations provide good examples of cases where the
Kramers-Moyal expansion fails, the simplest being the Poisson process mentioned in Sect.
7.2.1b.

In all of these, the size of the jump is $\pm1$ or some small integer, whereas typical
sizes of the variable may be large, e.g., the number of molecules or the position
of the random walker on a long lattice.

In such cases, we can introduce a system size parameter $\Omega$ such that the
transition probabilities can be written in terms of the intensive variables $x/\Omega$ etc. For
example, in the reaction of Sect. 7.1.3, $\Omega$ was the volume $V$ and $x/\Omega$ the
concentration. Let us use van Kampen's notation:

$a$ = extensive variable (number of molecules, etc., $\propto \Omega$)

$x = a/\Omega$ intensive variable (concentration of molecules).

The limit of interest is large $\Omega$ at fixed $x$. This corresponds to the approach to a
macroscopic system. We can rewrite the transition probability as

$$ W(a|a') = W(a'; \Delta a) , \qquad \Delta a = a - a' . \tag{7.2.28} $$
The essential point is that the size of the jump is expressed in terms of the extensive
quantity $\Delta a$, but the dependence on $a'$ is better expressed in terms of the intensive
variable $x$.

Thus, we assume that we can write

$$ W(a'; \Delta a) = \Omega\,\psi\left(\frac{a'}{\Omega}; \Delta a\right) . \tag{7.2.29} $$

If this is the case, we can now make an expansion. We choose a new variable $z$ so
that
$$ a = \Omega\phi(t) + \Omega^{1/2}z , \tag{7.2.30} $$

where $\phi(t)$ is a function to be determined. It will now be the case that the $\alpha_n(a)$ are
proportional to $\Omega$: we will write
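$$ \alpha_n(a) = \Omega\,\tilde\alpha_n(x) . \tag{7.2.31} $$

We now take the Kramers-Moyal expansion (7.2.25) and change the variable to
get

$$ \frac{\partial P(z, t)}{\partial t} - \Omega^{1/2}\dot\phi(t)\frac{\partial P(z, t)}{\partial z} = \sum_{n=1}^{\infty}\frac{\Omega^{1-n/2}}{n!}\left(-\frac{\partial}{\partial z}\right)^n\tilde\alpha_n[\phi(t) + \Omega^{-1/2}z]\,P(z, t) . \tag{7.2.32} $$

The terms of order $\Omega^{1/2}$ on either side will cancel if $\phi(t)$ obeys
$$ \dot\phi(t) = \tilde\alpha_1[\phi(t)] \tag{7.2.33} $$

which is the deterministic equation expected. We expand $\tilde\alpha_n[\phi(t) + \Omega^{-1/2}z]$ in powers
of $\Omega^{-1/2}$, rearrange and find

$$ \frac{\partial P(z, t)}{\partial t} = \sum_{m=2}^{\infty}\frac{\Omega^{-(m-2)/2}}{m!}\sum_{n=1}^{m}\frac{m!}{n!\,(m-n)!}\,\tilde\alpha_n^{(m-n)}[\phi(t)]\left(-\frac{\partial}{\partial z}\right)^nz^{m-n}P(z, t) . \tag{7.2.34} $$

Taking the large $\Omega$ limit, only the $m = 2$ term survives giving

$$ \frac{\partial P(z, t)}{\partial t} = -\tilde\alpha_1'[\phi(t)]\frac{\partial}{\partial z}zP(z, t) + \frac{1}{2}\tilde\alpha_2[\phi(t)]\frac{\partial^2}{\partial z^2}P(z, t) . \tag{7.2.35} $$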


a) Comparison with Kramers-Moyal Result 


The Kramers-Moyal Fokker-Planck equation, obtained by terminating (7.2.25)
after two terms, is

$$ \frac{\partial P(a)}{\partial t} = -\frac{\partial}{\partial a}[\alpha_1(a)P(a)] + \frac{1}{2}\frac{\partial^2}{\partial a^2}[\alpha_2(a)P(a)] \tag{7.2.36} $$

and changing variables to $x = a/\Omega$, we get

$$ \frac{\partial P(x)}{\partial t} = -\frac{\partial}{\partial x}[\tilde\alpha_1(x)P(x)] + \frac{1}{2\Omega}\frac{\partial^2}{\partial x^2}[\tilde\alpha_2(x)P(x)] . \tag{7.2.37} $$

We can now use the small noise theory of Sect. 6.3, with

$$ \varepsilon^2 = \frac{1}{\Omega} \tag{7.2.38} $$

and we find that substituting
$$ z = \Omega^{1/2}[x - \phi(t)] , \tag{7.2.39} $$

the lowest-order FPE for $z$ is exactly the same as the lowest-order term in van
Kampen's method (7.2.35). This means that if we are only interested in the lowest
order, we may use the Kramers-Moyal Fokker-Planck equation which may be
easier to handle than van Kampen's method. The results will differ, but to lowest
order in $\Omega^{-1/2}$ will agree, and each will only be valid to this order.

Thus, if a FPE has been obtained from a Master equation, its validity depends 
on the kind of limiting process used to derive it. If it has been derived in a limit
$\delta \to 0$ of the kind used in Sect. 7.2.1, then it can be taken seriously and the full
nonlinear dependence of $\alpha_1(a)$ and $\alpha_2(a)$ on $a$ can be exploited.

On the other hand, if it arises as the result of an $\Omega$ expansion like that in Sect.
7.2.3, only the small noise approximation has any validity. There is no point in
considering anything more than the linearisation, (7.2.35), about the deterministic
solution. The solution of this equation is given in terms of the corresponding
stochastic differential equation


$$ dz = \tilde\alpha_1'[\phi(t)]\,z\,dt + \sqrt{\tilde\alpha_2[\phi(t)]}\,dW(t) \tag{7.2.40} $$


by the results of Sect. 4.4.7 (4.4.69), or Sect. 4.4.9 (4.4.83). 


b) Example: Chemical Reaction X ⇌ A
From Sect. 7.1.2, we have

$$ W(x|x') = \delta_{x,x'+1}\,k_2a + \delta_{x,x'-1}\,k_1x' . \tag{7.2.41} $$
The assumption is

$$ a = a_0V , \qquad x = x_0V , \tag{7.2.42} $$

where $V$ is the volume of the system. This means that we assume the total amounts
of A and X to be proportional to $V$ (a reasonable assumption) and that the rates of
production and decay of X are proportional to $a$ and $x$, respectively.

Thus,

$$ W(x_0; \Delta x) = V[k_2a_0\,\delta_{\Delta x,1} + k_1x_0\,\delta_{\Delta x,-1}] \tag{7.2.43} $$

which is in the form of (7.2.29), with $\Omega \to V$, $a \to x$, etc.
Thus,

$$ \alpha_1(x) = \sum_{x'}(x' - x)W(x'|x) = k_2a - k_1x = V(k_2a_0 - k_1x_0) $$
$$ \alpha_2(x) = \sum_{x'}(x' - x)^2W(x'|x) = k_2a + k_1x = V(k_2a_0 + k_1x_0) . \tag{7.2.44} $$

The deterministic equation is
$$ \dot\phi(t) = [k_2a_0 - k_1\phi(t)] \tag{7.2.45} $$

whose solution is

$$ \phi(t) = \phi(0)e^{-k_1t} + \frac{k_2a_0}{k_1}(1 - e^{-k_1t}) . \tag{7.2.46} $$
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The Fokker-Planck equation is 


PPO) — 4, & P(e) + 5 So thaas + kxgONPC). (7.2.47) 
From (4.4.84, 85) we can compute that 
(z(t) = 2(O)e"*". (7.2.48) 


Usually, one would assume z(0) = 0, since the initial condition can be fully dealt 
with by the initial condition on ¢. Assuming z(0) is zero, we find 


var (2(0} = [2 + gover | =e (7.2.49) 
so that 

<x(t)) = Vole) = VolOerts + FAC — ek) (7.2.50) 

var {x(t)} = V var {z(t)} = e+ vs(0)e-*"|(1 — efit). (7.2.51) 


With the identificationVg(0) = N, these are exactly the same as the exact solutions 
(7.1.38-40) in Sect. 7.1.2. The stationary solution of (7.2.47) is 


P(z) = Wex (— aed (7.2.52) 
Are oes eM Olea = 
which is Gaussian approximation to the exact Poissonian. 
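
The closed forms (7.2.46) and (7.2.49) can be checked numerically. The following Python sketch integrates the variance equation implied by (7.2.47), namely d var{z}/dt = -2 k_1 var{z} + k_2 a_0 + k_1 φ(t), with a Runge-Kutta step and compares the result with (7.2.49). The parameter values and function names are illustrative assumptions and do not come from the text.

```python
import math

# Illustrative parameter values (assumptions, not taken from the text).
K1, K2, A0, PHI0 = 1.5, 2.0, 3.0, 0.7

def phi(t):
    """Deterministic solution (7.2.46)."""
    c = K2 * A0 / K1
    return PHI0 * math.exp(-K1 * t) + c * (1.0 - math.exp(-K1 * t))

def var_z_closed(t):
    """Closed-form variance (7.2.49), assuming z(0) = 0."""
    e = math.exp(-K1 * t)
    return (K2 * A0 / K1 + PHI0 * e) * (1.0 - e)

def var_z_numeric(t_end, steps=2000):
    """Integrate d(var)/dt = -2*K1*var + K2*A0 + K1*phi(t) by classical RK4."""
    def rhs(t, v):
        return -2.0 * K1 * v + K2 * A0 + K1 * phi(t)
    h = t_end / steps
    t, v = 0.0, 0.0
    for _ in range(steps):
        s1 = rhs(t, v)
        s2 = rhs(t + h / 2, v + h * s1 / 2)
        s3 = rhs(t + h / 2, v + h * s2 / 2)
        s4 = rhs(t + h, v + h * s3)
        v += h * (s1 + 2 * s2 + 2 * s3 + s4) / 6
        t += h
    return v
```

At long times the variance of z tends to k_2 a_0/k_1, which is the variance of the stationary Gaussian (7.2.52).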
The stationary solution of the Kramers-Moyal equation is

$$ P_s(x) = \frac{\mathcal{N}}{k_2 a + k_1 x}\exp\left[2\int dx\,\frac{k_2 a - k_1 x}{k_2 a + k_1 x}\right] = \mathcal{N}\,(k_2 a + k_1 x)^{-1 + 4k_2 a/k_1}\,e^{-2x}. \tag{7.2.53} $$

In fact, one can explicitly check the limit by setting

$$ x = V(k_2 a_0/k_1) + \delta \tag{7.2.54} $$

so that

$$ (7.2.53) = \mathcal{N}\,(2V k_2 a_0 + k_1\delta)^{-1 + 4V k_2 a_0/k_1}\,e^{-2V k_2 a_0/k_1 - 2\delta}. \tag{7.2.55} $$

Then,

$$ \log P_s(x) \approx \mathrm{const} - \frac{k_1}{2 k_2 a_0 V}\,(\delta + \delta^2). \tag{7.2.56} $$

Using the exact Poissonian solution, making the same substitution and using
Stirling's formula

$$ \log x! \sim (x + \tfrac{1}{2})\log x - x + \mathrm{const}, \tag{7.2.57} $$

one finds the same result as (7.2.56), but the exact results are different, in the sense
that even the ratio of the logarithms is different.

The term linear in $\delta$ is, in fact, of lower order in V: because using (7.2.39), we
find $\delta = z\sqrt{V}$ and

$$ \log P_s(z) \sim \mathrm{const} - \frac{k_1}{2 k_2 a_0}\left(\frac{z}{\sqrt{V}} + z^2\right) \tag{7.2.58} $$

so that in the large V limit, we have a simple Gaussian with zero mean.
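
The Stirling-formula comparison can be tried numerically. This minimal Python sketch evaluates the exact Poisson log-probability near its mean and compares it with the quadratic expansion in δ. Here `mu` stands for V k_2 a_0/k_1. The function names and the sample values are assumptions chosen for illustration.

```python
import math

def log_poisson(x, mu):
    """Exact log of the Poisson probability exp(-mu) * mu**x / x!."""
    return x * math.log(mu) - mu - math.lgamma(x + 1.0)

def gaussian_expansion(delta, mu):
    """Expansion of log P(mu + delta) - log P(mu) to second order in delta."""
    return -(delta + delta * delta) / (2.0 * mu)

def compare(mu, delta):
    """Return (exact, approximate) values of log P(mu + delta) - log P(mu)."""
    exact = log_poisson(mu + delta, mu) - log_poisson(mu, mu)
    return exact, gaussian_expansion(delta, mu)
```

For `mu` of order 10^4 and `delta` of order `sqrt(mu)`, the two values agree up to terms of relative order `delta/mu`, which is the sense in which the Gaussian approximates the Poissonian.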


c) Moment Hierarchy 
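From the expansion (7.2.34), we can develop equations for the moments

$$ \langle z^k\rangle = \int dz\,P(z,t)\,z^k \tag{7.2.59} $$

by direct substitution and integration by parts:

$$ \frac{d\langle z^k\rangle}{dt} = \sum_{m=2}^{\infty}\frac{\Omega^{-(m-2)/2}}{m!}\sum_{n=1}^{m,k}\frac{m!\,k!}{n!\,(m-n)!\,(k-n)!}\,\tilde\alpha_n^{(m-n)}[\phi(t)]\,\langle z^{m+k-2n}\rangle. \tag{7.2.60} $$

One can develop a hierarchy by expanding $\langle z^k\rangle$ in inverse powers of $\Omega^{1/2}$:

$$ \langle z^k\rangle = \sum_{r=0}^{\infty} M_r^k\,\Omega^{-r/2}. \tag{7.2.61} $$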
From such a hierarchy one can compute stationary moments and autocorrelation 
functions using the same techniques as those used in handling the moment hierar- 
chy for the small noise expansion of the Fokker-Planck equation in Sect.6.3.1. 
Van Kampen [7.2] has carried this out. 


7.2.4 Kurtz’s Theorem 


Kurtz [7.6] has demonstrated that in a certain sense, the Kramers-Moyal expansion
can give rise to a slightly stronger result than van Kampen's expansion. For the
restricted class of birth-death processes with polynomial transition probabilities,
he has shown the following. We consider the stochastic process obeying a
birth-death master equation

$$ \partial_t P(a,t) = \sum_{a'} W(a|a')P(a',t) - \sum_{a'} W(a'|a)P(a,t) \tag{7.2.62} $$

in which the scaling condition (7.2.29) is satisfied.

Then the process b(t), satisfying the stochastic differential equation

$$ db(t) = \alpha_1(b)\,dt + \sqrt{\alpha_2(b)}\,dW(t) \tag{7.2.63} $$

exists, and to each sample path a(t) of (7.2.62) a sample path b(t) of (7.2.63)
exists such that

$$ |b(t) - a(t)| \sim \log V \tag{7.2.64} $$

for all finite t.

This result implies the lowest order result of van Kampen. For, we make the
substitution of the form (7.2.30)

$$ a(t) = V\phi(t) + V^{1/2} z(t) \tag{7.2.65} $$
$$ b(t) = V\phi(t) + V^{1/2} y(t). \tag{7.2.66} $$

Then the characteristic function of z(t) is

$$ \langle\exp[isz(t)]\rangle = \langle\exp[isV^{-1/2}a(t) - isV^{1/2}\phi(t)]\rangle $$
$$ = \exp[-isV^{1/2}\phi(t)]\,\langle\exp[isV^{-1/2}b(t)]\rangle + O(V^{-1/2}\log V) $$
$$ = \langle\exp[isy(t)]\rangle + O(V^{-1/2}\log V). \tag{7.2.67} $$

Using now the asymptotic expansion for the FPE we know the distribution function
of y(t) approaches that of the FPE (7.2.35) to $O(V^{-1/2})$ and the result follows with,
however, a slightly weaker convergence because of the log V term involved. Thus,
in terms of quantities which can be calculated and measured, means, variances,
etc., Kurtz's apparently stronger result is equivalent to van Kampen's system size
expansion.


7.2.5 Critical Fluctuations 


The existence of a system size expansion as outlined in Sect.7.2.3 depends on the
fact that $\tilde\alpha_1'(a)$ does not vanish. It is possible, however, for situations to arise where

$$ \tilde\alpha_1'(\phi_s) = 0, \tag{7.2.68} $$

where $\phi_s$ is a stationary solution of the deterministic equation. This occurs, for
example, when we consider the reaction of Sect.7.1.3 for which (using the notation
of that section)

$$ \tilde\alpha_1(y) = (B y^2 + P - y^3 - yR)\,\kappa_2, $$

where $\kappa_2 = V^2 k_2$.
[Fig. 7.2. Graph showing different kinds of behaviour of $\tilde\alpha_1(y)$ which lead to $\tilde\alpha_1'(y) = 0$]

Two situations can occur, corresponding to A and B in Fig. 7.2. The situation
A corresponds to an unstable stationary state; any perturbation to the left will
eventually lead to C, but B is stable. Clearly the deterministic equation takes the
form

$$ \dot y = -\kappa\,(y - \phi_s)^3 \tag{7.2.69} $$


and we have a Master equation analogue of the cubic process of Sect.6.2.4a. 
Van Kampen [7.13] has shown that in this case we should write 


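$$ a = \Omega\,\phi(t) + \Omega^{\mu} z \tag{7.2.70} $$

in which case (7.2.32) becomes

$$ \frac{\partial P(z,t)}{\partial t} - \Omega^{1-\mu}\,\dot\phi(t)\,\frac{\partial P(z,t)}{\partial z} = \sum_{n=1}^{\infty}\frac{\Omega^{1-n\mu}}{n!}\left(-\frac{\partial}{\partial z}\right)^{n}\tilde\alpha_n[\phi(t) + \Omega^{\mu-1}z]\,P(z,t). \tag{7.2.71} $$

Suppose now that the first q - 1 derivatives of $\tilde\alpha_1(\phi_s)$ vanish. Then if we choose $\phi_s$
for $\phi(t)$, (7.2.71) becomes to lowest order

$$ \frac{\partial P(z,t)}{\partial t} = -\frac{1}{q!}\,\tilde\alpha_1^{(q)}(\phi_s)\,\Omega^{(1-q)(1-\mu)}\,\frac{\partial}{\partial z}(z^q P) + \tfrac{1}{2}\,\tilde\alpha_2(\phi_s)\,\Omega^{1-2\mu}\,\frac{\partial^2 P}{\partial z^2} + \text{higher terms}. \tag{7.2.72} $$

To make sure z remains of order unity, set

$$ (1 - q)(1 - \mu) = (1 - 2\mu), \quad \text{i.e.,} \quad \mu = \frac{q}{q+1} \tag{7.2.73} $$

so the result is

$$ \frac{\partial P}{\partial t} = \Omega^{-(q-1)/(q+1)}\left[-\frac{1}{q!}\,\tilde\alpha_1^{(q)}\,\frac{\partial}{\partial z}(z^q P) + \tfrac{1}{2}\,\tilde\alpha_2\,\frac{\partial^2 P}{\partial z^2}\right] \tag{7.2.74} $$

(where $\tilde\alpha_1^{(q)}$ and $\tilde\alpha_2$ are evaluated at $\phi_s$.) The fluctuations now vary on a slower time
scale $\tau$ given by

$$ \tau = t\,\Omega^{-(q-1)/(q+1)} \tag{7.2.75} $$

and the equation for the average is

$$ \frac{d\langle z\rangle}{d\tau} = \frac{1}{q!}\,\tilde\alpha_1^{(q)}\,\langle z^q\rangle \tag{7.2.76} $$

which is no longer that associated with the linearised deterministic equation. Of
course, stability depends on the sign of $\tilde\alpha_1^{(q)}$ and whether q is odd or even. The
simplest stable case occurs for q = 3, which occurs at the critical point B of Fig.7.2,
and in this case we have the cubic process of Sect.6.2.4a. The long-time scale is

$$ \tau = \Omega^{-1/2}\,t. \tag{7.2.77} $$

We see that for large $\Omega$, the system's time dependence is given as a function of
$\tau = \Omega^{-1/2} t$. Only for times $t \gtrsim \Omega^{1/2}$ does $\tau$ become a significant size, and thus it
is only for very long times t that any significant time development of the system
takes place. Thus, the motion of the system becomes very slow at large $\Omega$.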

The condition (7.2.68) is normally controllable by some external parameter, 
(say, for example, the temperature), and the point in the parameter space where 
(7.2.68) is satisfied is called a critical point. This property of very slow time 
development at a critical point is known as critical slowing down. 
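
The scaling exponents of this section follow from simple arithmetic, which the following Python sketch makes explicit. It computes the exponent μ of (7.2.73) and the slowing-down exponent of (7.2.75) for a given order q, using exact fractions. The function names are illustrative assumptions.

```python
from fractions import Fraction

def critical_scaling(q):
    """Return (mu, slowing_exponent) for a critical point of order q.

    mu solves (1 - q)(1 - mu) = 1 - 2*mu as in (7.2.73), and the slow time
    is tau = t * Omega**(-slowing_exponent) as in (7.2.75).
    """
    mu = Fraction(q, q + 1)
    slowing_exponent = Fraction(q - 1, q + 1)
    # Consistency check against the defining relation (7.2.73).
    assert (1 - q) * (1 - mu) == 1 - 2 * mu
    return mu, slowing_exponent

def slowdown_factor(q, omega):
    """Factor by which the fluctuation time scale is stretched at size omega."""
    _, exponent = critical_scaling(q)
    return omega ** float(exponent)
```

For q = 1 the usual scaling μ = 1/2 with no slowing down is recovered; for q = 3 the exponent is 1/2, so a system of size 10^6 evolves about 10^3 times more slowly.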


7.3 Boundary Conditions for Birth-Death Processes


For birth-death processes, we have a rather simple way of implementing boundary
conditions. For a process confined within an interval [a, b], it is clear that reflecting
and absorbing boundary conditions are obtained by forbidding the exit from the
interval or the return to it, respectively. Namely,


                   Reflecting          Absorbing
Boundary at a      $t^-(a) = 0$        $t^+(a-1) = 0$          (7.3.1)
Boundary at b      $t^+(b) = 0$        $t^-(b+1) = 0$.


It is sometimes useful, however, to insert boundaries in a process and, rather than
set certain transition probabilities equal to zero, impose boundary conditions
similar to those used for Fokker-Planck equations (Sect.5.2.1) so that the resulting
solution in the interval [a, b] is a solution of the Master equation with the
appropriate vanishing transition probabilities. This may be desired in order to
preserve the particular analytic form of the transition probabilities, which may have a
certain convenience.


a) Forward Master Equation 
We can write the forward Master equation as

$$ \partial_t P(x,t|x',t') = t^+(x-1)P(x-1,t|x',t') + t^-(x+1)P(x+1,t|x',t') - [t^+(x) + t^-(x)]P(x,t|x',t'). \tag{7.3.2} $$

Suppose we want a reflecting barrier at x = a. Then this could be obtained by
requiring

$$ t^-(a) = 0 \quad \text{and} \quad P(a-1,t|x',t') = 0. \tag{7.3.3} $$

The only equation affected by this requirement is that for $\partial_t P(a,t|x',t')$, for which
the same equation can be obtained by not setting $t^-(a) = 0$ but instead introducing
a fictitious $P(a-1,t|x',t')$ such that

$$ t^+(a-1)P(a-1,t|x',t') = t^-(a)P(a,t|x',t'). \tag{7.3.4} $$

This can be viewed as the analogue of the zero current requirement for a reflecting
barrier in a Fokker-Planck equation.
If we want an absorbing barrier at x = a, we can set

$$ t^+(a-1) = 0. \tag{7.3.5} $$

After reaching the point a - 1, the process never returns and its behaviour is now
of no interest. The only equation affected by this is that for $\partial_t P(a,t|x',t')$ and the
same equation can be again obtained by introducing a fictitious $P(a-1,t|x',t')$
such that

$$ P(a-1,t|x',t') = 0. \tag{7.3.6} $$

Summarising, we have the alternative formulation of imposed boundary
conditions which yield the same effect in [a, b] as (7.3.1):


Forward Master Equation on interval [a, b]

                   Reflecting                          Absorbing
Boundary at a      $t^-(a)P(a) = t^+(a-1)P(a-1)$       $P(a-1) = 0$          (7.3.7)
Boundary at b      $t^+(b)P(b) = t^-(b+1)P(b+1)$       $P(b+1) = 0$
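
The effect of the two kinds of boundary can be seen on the rate matrix of the Master equation restricted to [a, b]. This Python sketch builds that matrix for given jump rates. With reflecting ends every column sums to zero (probability is conserved), whereas an absorbing end leaves a negative column sum equal to the exit rate. The function names and the sample rates are assumptions made for illustration.

```python
def generator(t_plus, t_minus, a, b, reflect_a=True, reflect_b=True):
    """Rate matrix L on states a..b with dP/dt = L P (columns index the source).

    A reflecting end forbids the exit jump; an absorbing end keeps the
    exit jump, so probability leaves the interval and is never returned.
    """
    n = b - a + 1
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        x = a + j
        up = t_plus(x)
        down = t_minus(x)
        if x == b and reflect_b:
            up = 0.0      # t^+(b) = 0
        if x == a and reflect_a:
            down = 0.0    # t^-(a) = 0
        L[j][j] -= up + down
        if j + 1 < n:
            L[j + 1][j] += up
        if j - 1 >= 0:
            L[j - 1][j] += down
    return L

def column_sums(L):
    """Sum of each column; zero means that state loses no probability overall."""
    n = len(L)
    return [sum(L[i][j] for i in range(n)) for j in range(n)]
```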
b) Backward Master Equation 
The backward Master equation is (see Sect. 3.6)

$$ \partial_t P(x,t|x',t') = t^+(x')[P(x,t|x'+1,t') - P(x,t|x',t')] + t^-(x')[P(x,t|x'-1,t') - P(x,t|x',t')]. \tag{7.3.8} $$

In the case of a reflecting barrier at x = a, it is clear that $t^-(a) = 0$ is equivalent to
constructing a fictitious $P(x,t|a-1,t')$ such that

$$ P(x,t|a-1,t') = P(x,t|a,t'). \tag{7.3.9} $$

In the absorbing barrier case, none of the equations for $P(x,t|x',t')$ with
$x, x' \in [a,b]$ involve $t^+(a-1)$. However, because $t^+(a-1) = 0$, the equations in
which $x' \le a - 1$ will clearly preserve the condition

$$ P(x,t|x',t') = 0, \qquad x \in [a,b], \quad x' \le a - 1 \tag{7.3.10} $$

and the effect of this on the equation with $x' = a$ will be to impose

$$ P(x,t|a-1,t') = 0 \tag{7.3.11} $$

which is therefore the required boundary condition. Summarising:


Backward Master Equation on interval [a, b]

                   Reflecting                        Absorbing
Boundary at a      $P(\cdot|a-1) = P(\cdot|a)$       $P(\cdot|a-1) = 0$          (7.3.12)
Boundary at b      $P(\cdot|b+1) = P(\cdot|b)$       $P(\cdot|b+1) = 0$


7.4 Mean First Passage Times 


The method for calculating these in the simple one-step case parallels that of the
Fokker-Planck equation (Sect.5.2.7) very closely. We assume the system is confined
to the range

$$ a \le x \le b \tag{7.4.1} $$

and is absorbed or reflected at either end, as the case may be. For definiteness we
take a system with

    reflecting barrier at x = a;
    absorbing state at x = b + 1.

The argument is essentially the same as that in Sect.5.2.7 and we find that T(x),
the mean time for a particle initially at x to be absorbed, satisfies the equation
related to the backward Master equation (7.3.8):

$$ t^+(x)[T(x+1) - T(x)] + t^-(x)[T(x-1) - T(x)] = -1 \tag{7.4.2} $$

with the boundary conditions corresponding to (5.2.159) and arising from (7.3.12):

$$ T(a-1) = T(a) \tag{7.4.3a} $$
$$ T(b+1) = 0. \tag{7.4.3b} $$

Define

$$ U(x) = T(x+1) - T(x) \tag{7.4.4} $$

so (7.4.2) becomes

$$ t^+(x)U(x) - t^-(x)U(x-1) = -1. \tag{7.4.5} $$

Define

$$ \phi(x) = \prod_{z=a+1}^{x}\frac{t^-(z)}{t^+(z)} \tag{7.4.6} $$

and

$$ S(x) = U(x)/\phi(x); \tag{7.4.7} $$

then (7.4.5) is equivalent to

$$ t^+(x)\phi(x)[S(x) - S(x-1)] = -1 \tag{7.4.8} $$

with a solution

$$ S(x) = -\sum_{z=a}^{x}\frac{1}{t^+(z)\phi(z)}. \tag{7.4.9} $$

This satisfies the boundary condition (7.4.3a), which implies that

$$ U(a-1) = S(a-1) = 0. \tag{7.4.10} $$

Hence,

$$ T(x+1) - T(x) = -\phi(x)\sum_{z=a}^{x}\frac{1}{t^+(z)\phi(z)} \tag{7.4.11} $$

and

$$ T(x) = \sum_{y=x}^{b}\phi(y)\sum_{z=a}^{y}\frac{1}{t^+(z)\phi(z)} \qquad (a \text{ reflecting},\ b \text{ absorbing},\ b > a) \tag{7.4.12} $$

which also satisfies the boundary condition T(b + 1) = 0, (7.4.3b).
Similarly, if a is absorbing and b reflecting,

$$ T(x) = \sum_{y=a}^{x}\phi(y)\sum_{z=y}^{b}\frac{1}{t^+(z)\phi(z)} \qquad (a \text{ absorbing},\ b \text{ reflecting},\ b > a) \tag{7.4.13} $$

and a formula corresponding to (5.2.158) for both a and b absorbing can be
similarly deduced.
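
The double sum (7.4.12) is straightforward to evaluate. The Python sketch below computes T(x) for a reflecting barrier at a and an absorbing state at b + 1, accumulating the inner sum once so that the cost is linear in b - a. The function name and the sample rates used to exercise it are assumptions for illustration; the result can be checked by substituting back into (7.4.2).

```python
def mean_first_passage_times(t_plus, t_minus, a, b):
    """Evaluate (7.4.12): a reflecting, absorbing state at b + 1.

    Returns a dict T with keys a-1 .. b+1, where T[a-1] = T[a] and
    T[b+1] = 0 encode the boundary conditions (7.4.3a, b).
    """
    # phi(x) = prod_{z=a+1}^{x} t^-(z)/t^+(z), with phi(a) = 1, see (7.4.6).
    phi = {a: 1.0}
    for x in range(a + 1, b + 1):
        phi[x] = phi[x - 1] * t_minus(x) / t_plus(x)
    # inner[y] = sum_{z=a}^{y} 1 / (t^+(z) phi(z)).
    inner = {}
    running = 0.0
    for z in range(a, b + 1):
        running += 1.0 / (t_plus(z) * phi[z])
        inner[z] = running
    # T(x) = T(x + 1) + phi(x) * inner(x), starting from T(b + 1) = 0.
    T = {b + 1: 0.0}
    for x in range(b, a - 1, -1):
        T[x] = T[x + 1] + phi[x] * inner[x]
    T[a - 1] = T[a]
    return T
```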
7.4.1 Probability of Absorption


The mean time to absorption is always finite when a and b are finite. If, however,
b is at infinity and is reflecting, the mean time may diverge. This does not itself
mean that there is a finite probability of not being absorbed. The precise result [Ref.
7.7, Sect.4.7] is the following.

If the process takes place on the interval $(a, \infty)$ and a is absorbing, then the
probability of absorption into state a - 1 from state x is given as follows. Define
the function M(x) by 


M(x) = x bit a (7.4.14) 


Then if $M(x) < \infty$, the probability of absorption at a - 1, from state x, is

$$ \frac{M(x)}{1 + M(x)} \tag{7.4.15} $$

and if $M(x) = \infty$, this probability is one. If this probability is 1, then the mean
time to absorption is (7.4.13).


7.4.2 Comparison with Fokker-Planck Equation 


The formulae (7.4.12, 13) are really very similar to the corresponding formulae
(7.4.1, 2) for a diffusion process. In fact, using the model of Sect. 7.2.1c it is not
difficult to show that in the limit $\delta \to 0$ the two become the same.

If we wish to deal with the kind of problem related to escape over a potential
barrier (Sect.5.2.7c) which turns up in the context of this kind of master equation,
for example, in the bistable reaction discussed in Sect.7.1.3, very similar
approximations can be made. In this example, let us consider the mean first passage time from
the stable stationary state $x_1$ to the other stable stationary state $x_3$.

Then the point x = 0 is a reflecting barrier, so the interval under consideration
is $(0, x_3)$ with initial point $x_1$. Notice that

$$ \phi(x) = \prod_{z=1}^{x}\frac{t^-(z)}{t^+(z)} = \frac{P_s(0)\,t^+(0)}{P_s(x)\,t^+(x)} \tag{7.4.16} $$

so that

$$ T(x_1 \to x_3) = \sum_{y=x_1}^{x_3}[P_s(y)\,t^+(y)]^{-1}\sum_{z=0}^{y}P_s(z). \tag{7.4.17} $$

If we assume that $P_s(y)^{-1}$ has a sharp maximum at the unstable point $x_2$, we can
set $y = x_2$ in all other factors in (7.4.17) to obtain

$$ T(x_1 \to x_3) \sim \frac{n_1}{t^+(x_2)}\sum_{y=x_1}^{x_3}[P_s(y)]^{-1}, \tag{7.4.18} $$

where

$$ n_1 = \sum_{z=0}^{x_2}P_s(z) \tag{7.4.19} $$

is the total probability of being in the lower peak of the stationary distribution.
The result is a discrete analogue of those obtained in Sect. 5.2.7c. 
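
The identity (7.4.16), and the rewriting of the double sum of (7.4.12) in the form (7.4.17), can be verified numerically for any one-step process with a reflecting barrier at x = 0. The following Python sketch does this with unnormalised stationary weights, since the normalisation of P_s cancels. All function names and the sample rates are illustrative assumptions.

```python
def stationary_weights(t_plus, t_minus, n):
    """Unnormalised stationary weights w(x) on 0..n from the one-step
    balance condition t^+(x-1) w(x-1) = t^-(x) w(x)."""
    w = [1.0]
    for x in range(1, n + 1):
        w.append(w[-1] * t_plus(x - 1) / t_minus(x))
    return w

def phi(t_plus, t_minus, x):
    """phi(x) = prod_{z=1}^{x} t^-(z)/t^+(z), with phi(0) = 1."""
    out = 1.0
    for z in range(1, x + 1):
        out *= t_minus(z) / t_plus(z)
    return out

def passage_time_from_stationary(t_plus, t_minus, x1, x3):
    """Right-hand side of (7.4.17) using unnormalised weights.

    The normalisation of P_s cancels between numerator and denominator.
    """
    w = stationary_weights(t_plus, t_minus, x3)
    total = 0.0
    for y in range(x1, x3 + 1):
        total += sum(w[: y + 1]) / (w[y] * t_plus(y))
    return total
```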


7.5 Birth-Death Systems with Many Variables 


There is a very wide class of systems whose time development can be considered as 
the result of individual encounters between members of some population. These 
include, for example, 
—chemical reactions, which arise by transformations of molecules on collision; 
—population systems, which die, give birth, mate and consume each other; 
—systems of epidemics, in which diseases are transmitted from individual to 
individual by contact. 
All of these can usually be modelled by what I call combinatorial kinetics, in which 
the transition probability for a certain transformation consequent on that en- 
counter is proportional to the number of possible encounters of that type. 
For example, in a chemical reaction X ⇌ 2Y, the reaction X → 2Y occurs by
spontaneous decay, a degenerate kind of encounter, involving only one individual.
The number of encounters of this kind is the number of X; hence, we say

$$ t(x \to x - 1,\ y \to y + 2) = k_1 x. \tag{7.5.1} $$

For the reverse reaction, one can assemble pairs of molecules of Y in y(y - 1)/2
different ways. Hence

$$ t(x \to x + 1,\ y \to y - 2) = k_2\,y(y - 1). \tag{7.5.2} $$


In general, we can consider encounters of many kinds between molecules, species, 
etc., of many kinds. Using the language of chemical reactions, we have the general 
formulation as follows. 

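Consider an n-component reacting system involving s different reactions:

$$ \sum_a N_a^A X_a \underset{k_A^-}{\overset{k_A^+}{\rightleftharpoons}} \sum_a M_a^A X_a \qquad (A = 1, 2, \ldots, s). \tag{7.5.3} $$

The coefficient $N_a^A$ of $X_a$ is the number of molecules of $X_a$ involved on the left
and $M_a^A$ is the number involved on the right. We introduce a vector notation so that
if $x_a$ is the number of molecules of $X_a$, then

$$ \boldsymbol{x} = (x_1, x_2, \ldots, x_n) $$
$$ \boldsymbol{N}^A = (N_1^A, N_2^A, \ldots, N_n^A) \tag{7.5.4} $$
$$ \boldsymbol{M}^A = (M_1^A, M_2^A, \ldots, M_n^A) $$

and we also define

$$ \boldsymbol{r}^A = \boldsymbol{M}^A - \boldsymbol{N}^A. \tag{7.5.5} $$

Clearly, as reaction A proceeds one step in the forward direction,

$$ \boldsymbol{x} \to \boldsymbol{x} + \boldsymbol{r}^A \tag{7.5.6} $$

and in the backward direction,

$$ \boldsymbol{x} \to \boldsymbol{x} - \boldsymbol{r}^A. \tag{7.5.7} $$

The rate constants are defined by

$$ t_A^+(\boldsymbol{x}) = k_A^+\prod_a\frac{x_a!}{(x_a - N_a^A)!} $$
$$ t_A^-(\boldsymbol{x}) = k_A^-\prod_a\frac{x_a!}{(x_a - M_a^A)!} \tag{7.5.8} $$

which are proportional, respectively, to the number of ways of choosing the
combination $\boldsymbol{N}^A$ or $\boldsymbol{M}^A$ from $\boldsymbol{x}$ molecules. The Master equation is thus

$$ \partial_t P(\boldsymbol{x},t) = \sum_A\{[t_A^-(\boldsymbol{x} + \boldsymbol{r}^A)P(\boldsymbol{x} + \boldsymbol{r}^A,t) - t_A^+(\boldsymbol{x})P(\boldsymbol{x},t)] + [t_A^+(\boldsymbol{x} - \boldsymbol{r}^A)P(\boldsymbol{x} - \boldsymbol{r}^A,t) - t_A^-(\boldsymbol{x})P(\boldsymbol{x},t)]\}. \tag{7.5.9} $$

This form is, of course, a completely general way of writing a time-homogeneous
Master equation for an integer variable $\boldsymbol{x}$ in which steps of size $\boldsymbol{r}^A$ can occur. It is
only by making the special choice (7.5.8) for the transition probabilities per unit
time that the general combinatorial Master equation arises. Another name is the
chemical Master equation, since such equations are particularly adapted to chemical
reactions.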
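
The combinatorial rates (7.5.8) are products of falling factorials, which the following Python sketch evaluates directly with integer arithmetic. The function names are illustrative assumptions; the arguments `N`, `M`, `k_plus` and `k_minus` correspond to the stoichiometric vectors and rate constants of a single reaction A.

```python
def falling_factorial(x, n):
    """x! / (x - n)! for non-negative integers; zero whenever x < n."""
    out = 1
    for i in range(n):
        out *= x - i
    return out

def propensities(x, N, M, k_plus, k_minus):
    """Return (t_plus, t_minus) of (7.5.8) for one reaction.

    x, N and M are equal-length tuples of non-negative integers.
    """
    t_plus = k_plus
    t_minus = k_minus
    for xa, na, ma in zip(x, N, M):
        t_plus *= falling_factorial(xa, na)
        t_minus *= falling_factorial(xa, ma)
    return t_plus, t_minus

def stoichiometric_step(N, M):
    """r = M - N, the change in x for one forward step, see (7.5.5)."""
    return tuple(m - n for n, m in zip(N, M))
```

For the reaction X ⇌ 2Y one has N = (1, 0) and M = (0, 2), which reproduces the rates (7.5.1) and (7.5.2).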


7.5.1 Stationary Solutions when Detailed Balance Holds 


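In general, there is no explicit way of writing the stationary solution in a practical
form. However, if detailed balance is satisfied, the stationary solution is easily
derived. The variable $\boldsymbol{x}$, being simply a vector of numbers, can only be an even
variable; hence, detailed balance must take the form (from Sect. 5.3.5)

$$ t_A^-(\boldsymbol{x} + \boldsymbol{r}^A)\,P_s(\boldsymbol{x} + \boldsymbol{r}^A) = t_A^+(\boldsymbol{x})\,P_s(\boldsymbol{x}) \tag{7.5.10} $$

for all A. The requirement that this holds for all A puts quite stringent requirements
on the $t_A^\pm$. This arises from the fact that (7.5.10) provides a way of calculating
$P_s(\boldsymbol{x}_0 + n\boldsymbol{r}^A)$ for any n and any initial $\boldsymbol{x}_0$. Using this method for all available A, we
can generate $P_s(\boldsymbol{x})$ on the space of all $\boldsymbol{x}$ which can be written

$$ \boldsymbol{x} = \boldsymbol{x}_0 + \sum_A n_A\boldsymbol{r}^A \qquad (n_A \text{ integral}) \tag{7.5.11} $$

but the solutions obtained may be ambiguous since, for example, from (7.5.10) we
may write

$$ P_s[(\boldsymbol{x} + \boldsymbol{r}^A) + \boldsymbol{r}^B] = \frac{P_s(\boldsymbol{x})\,t_A^+(\boldsymbol{x})\,t_B^+(\boldsymbol{x} + \boldsymbol{r}^A)}{t_A^-(\boldsymbol{x} + \boldsymbol{r}^A)\,t_B^-(\boldsymbol{x} + \boldsymbol{r}^A + \boldsymbol{r}^B)} $$

but

$$ P_s[(\boldsymbol{x} + \boldsymbol{r}^B) + \boldsymbol{r}^A] = \frac{P_s(\boldsymbol{x})\,t_B^+(\boldsymbol{x})\,t_A^+(\boldsymbol{x} + \boldsymbol{r}^B)}{t_B^-(\boldsymbol{x} + \boldsymbol{r}^B)\,t_A^-(\boldsymbol{x} + \boldsymbol{r}^A + \boldsymbol{r}^B)}. \tag{7.5.12} $$

Using the combinatorial forms (7.5.8) and substituting in (7.5.12), we find that
this condition is automatically satisfied.

The condition becomes nontrivial when the same two points can be connected
to each other in two essentially different ways, i.e., if, for example,

$$ \boldsymbol{N}^A \ne \boldsymbol{N}^B, \qquad \boldsymbol{M}^A \ne \boldsymbol{M}^B, \qquad \text{but} \quad \boldsymbol{r}^A = \boldsymbol{r}^B. \tag{7.5.13} $$

In this case, uniqueness of $P_s(\boldsymbol{x} + \boldsymbol{r}^A)$ in (7.5.10) requires

$$ \frac{t_A^+(\boldsymbol{x})}{t_A^-(\boldsymbol{x} + \boldsymbol{r}^A)} = \frac{t_B^+(\boldsymbol{x})}{t_B^-(\boldsymbol{x} + \boldsymbol{r}^A)} \tag{7.5.14} $$

and this means

$$ \frac{k_A^+}{k_A^-} = \frac{k_B^+}{k_B^-}. \tag{7.5.15} $$

If there are two chains A, B, C, ..., and A', B', C', ..., of reactions such that

$$ \boldsymbol{r}^A + \boldsymbol{r}^B + \boldsymbol{r}^C + \cdots = \boldsymbol{r}^{A'} + \boldsymbol{r}^{B'} + \boldsymbol{r}^{C'} + \cdots, \tag{7.5.16} $$

direct substitution shows that

$$ P_s(\boldsymbol{x} + \boldsymbol{r}^A + \boldsymbol{r}^B + \boldsymbol{r}^C + \cdots) = P_s(\boldsymbol{x} + \boldsymbol{r}^{A'} + \boldsymbol{r}^{B'} + \boldsymbol{r}^{C'} + \cdots) \tag{7.5.17} $$

only if

$$ \frac{k_A^+ k_B^+ k_C^+\cdots}{k_A^- k_B^- k_C^-\cdots} = \frac{k_{A'}^+ k_{B'}^+ k_{C'}^+\cdots}{k_{A'}^- k_{B'}^- k_{C'}^-\cdots} \tag{7.5.18} $$

which is, therefore, the condition for detailed balance in a Master equation with
combinatorial kinetics.
A solution for $P_s(\boldsymbol{x})$ in this case is a multivariate Poisson

$$ P_s(\boldsymbol{x}) = \prod_a\frac{\alpha_a^{x_a}\,e^{-\alpha_a}}{x_a!} \tag{7.5.19} $$

which we check by substituting into (7.5.10), which gives

$$ \prod_a\frac{\alpha_a^{x_a + r_a^A}\,e^{-\alpha_a}}{(x_a + r_a^A)!}\;k_A^-\prod_a\frac{(x_a + r_a^A)!}{(x_a + r_a^A - M_a^A)!} = \prod_a\frac{\alpha_a^{x_a}\,e^{-\alpha_a}}{x_a!}\;k_A^+\prod_a\frac{x_a!}{(x_a - N_a^A)!}. \tag{7.5.20} $$

Using the fact that

$$ r_a^A = M_a^A - N_a^A, $$

we find that

$$ k_A^+\prod_a\alpha_a^{N_a^A} = k_A^-\prod_a\alpha_a^{M_a^A}. \tag{7.5.21} $$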
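
As a small check of (7.5.19) to (7.5.21), the Python sketch below evaluates the detailed-balance residual of (7.5.10) for a single reaction with a multivariate Poisson trial solution. The residual vanishes when the Poisson parameters satisfy (7.5.21) and not otherwise. The function names and the sample numbers are assumptions for illustration, and `math.prod` requires Python 3.8 or later.

```python
import math

def poisson_product(x, alpha):
    """Multivariate Poisson (7.5.19)."""
    p = 1.0
    for xa, aa in zip(x, alpha):
        p *= math.exp(-aa) * aa ** xa / math.factorial(xa)
    return p

def falling(x, n):
    """x! / (x - n)! for non-negative integers."""
    out = 1
    for i in range(n):
        out *= x - i
    return out

def detailed_balance_residual(x, N, M, k_plus, k_minus, alpha):
    """t^-(x + r) P_s(x + r) - t^+(x) P_s(x) for one reaction, cf. (7.5.10).

    The state x must be such that x + r has no negative component.
    """
    r = [m - n for n, m in zip(N, M)]
    x_new = [xa + ra for xa, ra in zip(x, r)]
    t_plus = k_plus * math.prod(falling(xa, na) for xa, na in zip(x, N))
    t_minus = k_minus * math.prod(falling(xa, ma) for xa, ma in zip(x_new, M))
    return t_minus * poisson_product(x_new, alpha) - t_plus * poisson_product(x, alpha)
```

For X ⇌ 2Y, condition (7.5.21) reads k^+ α_1 = k^- α_2^2, so with k^+ = 2, k^- = 0.5 and α_2 = 3 one needs α_1 = 2.25.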


However, the most general solution will have this form only subject to
conservation laws of various kinds. For example, in the reaction

$$ X \rightleftharpoons 2Y \tag{7.5.22} $$

the quantity 2x + y is conserved. Thus, the stationary distribution is

$$ \frac{e^{-\alpha_1}\alpha_1^x}{x!}\,\frac{e^{-\alpha_2}\alpha_2^y}{y!}\,\phi(2x + y) \tag{7.5.23} $$

where $\phi$ is an arbitrary function. Choosing $\phi(2x + y) = 1$ gives the Poissonian
solution. Another choice is

$$ \phi(2x + y) = \delta(2x + y, N) \tag{7.5.24} $$

which corresponds to fixing the total of 2x + y at N.
As a degenerate form of this, one sometimes considers a reaction written as

$$ A \rightleftharpoons 2Y \tag{7.5.25} $$

in which, however, A is considered a fixed, deterministic number and the possible
reactions are

$$ A \to 2Y: \quad t^+(y) = k_1 a $$
$$ 2Y \to A: \quad t^-(y) = k_2\,y(y - 1). \tag{7.5.26} $$

In this case, the conservation law is now simply that y is always even, or always
odd. The stationary solution is of the form

$$ P_s(y) = \frac{e^{-\alpha}\alpha^y}{y!}\,\psi(y, \alpha) \tag{7.5.27} $$

where $\psi(y, \alpha)$ is a function which depends on y only through the evenness or oddness
of y.
7.5.2 Stationary Solutions Without Detailed Balance (Kirchhoff's Solution)


There is a method which, in principle, determines stationary solutions in general, 
though it does not seem to have found great practical use. The interested reader 
is referred to Haken [Ref. 7.8, Sect.4.8] and Schnakenberg [7.4] for a detailed treat- 
ment. In general, however, approximation methods have more to offer. 


7.5.3 System Size Expansion and Related Expansions 


In general we find that in chemical Master equations a system size expansion does
exist. The rate of production or absorption is expected to be proportional to $\Omega$, the
size of the system. This means that as $\Omega \to \infty$, we expect

$$ \boldsymbol{x} \sim \Omega\boldsymbol{\rho}, \tag{7.5.28} $$

where $\boldsymbol{\rho}$ is the set of chemical concentrations. Thus, we must have $t_A^\pm(\boldsymbol{x})$ proportional
to $\Omega$ as $\Omega \to \infty$, so that this requires

$$ k_A^+ \sim \kappa_A^+\,\Omega^{1 - \sum_a N_a^A}, \qquad k_A^- \sim \kappa_A^-\,\Omega^{1 - \sum_a M_a^A}, \tag{7.5.29} $$

with $\kappa_A^\pm$ independent of $\Omega$.

Under these circumstances, a multivariate form of van Kampen's system size
expansion can be developed. This is so complicated that it will not be explicitly derived
here, but as in the single variable case, we have a Kramers-Moyal expansion
whose first two terms give a diffusion process whose asymptotic form is the same as
that arising from a system size expansion.

The Kramers-Moyal expansion from (7.5.9) can be derived in exactly the same
way as in Sect.7.2.2, in fact, rather more easily, since (7.5.9) is already in the
appropriate form. Thus, we have

$$ \partial_t P(\boldsymbol{x},t) = \sum_A\sum_{n=1}^{\infty}\left\{\frac{(\boldsymbol{r}^A\cdot\nabla)^n}{n!}[t_A^-(\boldsymbol{x})P(\boldsymbol{x},t)] + \frac{(-\boldsymbol{r}^A\cdot\nabla)^n}{n!}[t_A^+(\boldsymbol{x})P(\boldsymbol{x},t)]\right\} \tag{7.5.30} $$

and we now truncate this to second order to obtain

$$ \partial_t P(\boldsymbol{x},t) = -\sum_a\partial_a[A_a(\boldsymbol{x})P(\boldsymbol{x},t)] + \tfrac{1}{2}\sum_{a,b}\partial_a\partial_b[B_{ab}(\boldsymbol{x})P(\boldsymbol{x},t)], \tag{7.5.31} $$

$$ A_a(\boldsymbol{x}) = \sum_A r_a^A[t_A^+(\boldsymbol{x}) - t_A^-(\boldsymbol{x})] $$
$$ B_{ab}(\boldsymbol{x}) = \sum_A r_a^A r_b^A[t_A^+(\boldsymbol{x}) + t_A^-(\boldsymbol{x})]. \tag{7.5.32} $$

In this form we have the chemical Fokker-Planck equation corresponding to the
Master equation. However, we note that this is really only valid as an approximation,
whose large volume asymptotic expansion is identical to order $1/\Omega$ with that
of the corresponding Master equation.

If one is satisfied with this degree of approximation, it is often simpler to use the
Fokker-Planck equation than the Master equation.


7.6 Some Examples 
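7.6.1 X + A ⇌ 2X


Here,

$$ t^+(x) = k_1 a x $$
$$ t^-(x) = k_2\,x(x - 1). \tag{7.6.1} $$

Hence,

$$ A(x) = k_1 a x - k_2\,x(x - 1) \sim k_1 a x - k_2 x^2 \quad \text{to order } 1/\Omega $$
$$ B(x) = k_1 a x + k_2\,x(x - 1) \sim k_1 a x + k_2 x^2 \quad \text{to order } 1/\Omega. \tag{7.6.2} $$


7.6.2 $X \underset{k}{\overset{\gamma}{\rightleftharpoons}} Y \underset{\gamma}{\overset{k}{\rightleftharpoons}} A$


Here we have

$$ \boldsymbol{r}^1 = (1, -1); \qquad t_1^+(\boldsymbol{x}) = \gamma y, \qquad t_1^-(\boldsymbol{x}) = kx \tag{7.6.3} $$
$$ \boldsymbol{r}^2 = (0, 1); \qquad t_2^+(\boldsymbol{x}) = ka, \qquad t_2^-(\boldsymbol{x}) = \gamma y. \tag{7.6.4} $$

Hence,

$$ \boldsymbol{A}(\boldsymbol{x}) = \begin{pmatrix} 1 \\ -1 \end{pmatrix}(\gamma y - kx) + \begin{pmatrix} 0 \\ 1 \end{pmatrix}(ka - \gamma y) \tag{7.6.5} $$
$$ = \begin{pmatrix} \gamma y - kx \\ kx + ka - 2\gamma y \end{pmatrix} \tag{7.6.6} $$

$$ B(\boldsymbol{x}) = \begin{pmatrix} 1 \\ -1 \end{pmatrix}(1, -1)\,(\gamma y + kx) + \begin{pmatrix} 0 \\ 1 \end{pmatrix}(0, 1)\,(ka + \gamma y) \tag{7.6.7} $$
$$ = \begin{pmatrix} \gamma y + kx & -\gamma y - kx \\ -\gamma y - kx & 2\gamma y + kx + ka \end{pmatrix}. \tag{7.6.8} $$

If we now use the linearised form about the stationary state,

$$ \gamma y = kx = ka \tag{7.6.9} $$

$$ B_s = \begin{pmatrix} 2ka & -2ka \\ -2ka & 4ka \end{pmatrix}. \tag{7.6.10} $$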
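
The drift vector and diffusion matrix of (7.5.32) are simple sums over reactions, so they are easy to evaluate mechanically. The Python sketch below does this for a list of reactions and applies it to the example of Sect. 7.6.2, reproducing the stationary matrix (7.6.10). The function names and the sample parameter values are assumptions for illustration.

```python
def drift_and_diffusion(reactions):
    """Evaluate A_a and B_ab of (7.5.32).

    `reactions` is a list of (r, t_plus, t_minus), with r a tuple and the
    rates already evaluated at the state of interest.
    """
    n = len(reactions[0][0])
    A = [0.0] * n
    B = [[0.0] * n for _ in range(n)]
    for r, tp, tm in reactions:
        for a in range(n):
            A[a] += r[a] * (tp - tm)
            for b in range(n):
                B[a][b] += r[a] * r[b] * (tp + tm)
    return A, B

def example_7_6_2(x, y, k, gamma, a):
    """Reactions of Sect. 7.6.2 with r^1 = (1, -1) and r^2 = (0, 1)."""
    return drift_and_diffusion([
        ((1, -1), gamma * y, k * x),   # Y -> X forward, X -> Y backward
        ((0, 1), k * a, gamma * y),    # A -> Y forward, Y -> A backward
    ])
```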
7.6.3 Prey-Predator System 


The prey-predator system of Sect. 1.3 provides a good example of the kind of system 
in which we are interested. As a chemical reaction we can write it as 


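  i)    X + A → 2X      $\boldsymbol{r}^1 = (1, 0)$
  ii)   X + Y → 2Y      $\boldsymbol{r}^2 = (-1, 1)$          (7.6.11)
  iii)  Y → B           $\boldsymbol{r}^3 = (0, -1)$.


The reactions are all irreversible (though reversibility may be introduced) so we
have

$$ t_A^-(\boldsymbol{x}) = 0 \qquad (A = 1, 2, 3) $$

but

$$ t_1^+(\boldsymbol{x}) = k_1 a\,\frac{x!}{(x-1)!}\,\frac{y!}{y!} = k_1 a x $$
$$ t_2^+(\boldsymbol{x}) = k_2\,\frac{x!}{(x-1)!}\,\frac{y!}{(y-1)!} = k_2 x y \tag{7.6.12} $$
$$ t_3^+(\boldsymbol{x}) = k_3\,\frac{x!}{x!}\,\frac{y!}{(y-1)!} = k_3 y. $$

The Master equation can now be explicitly written out using (7.5.9): one obtains

$$ \partial_t P(x,y) = k_1 a\,(x - 1)P(x - 1, y) + k_2\,(x + 1)(y - 1)P(x + 1, y - 1) + k_3\,(y + 1)P(x, y + 1) - (k_1 a x + k_2 x y + k_3 y)P(x, y). \tag{7.6.13} $$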


There are no exact solutions of this equation, so approximation methods must be 
used. 
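
Besides analytic approximations, sample paths of (7.6.13) can be generated directly. The following Python sketch uses the standard stochastic simulation (Gillespie) algorithm, which is not discussed in the text and is named here as a swapped-in technique: waiting times are exponential with the total rate, and the reaction is chosen with probability proportional to its rate (7.6.12). The function name, argument names and parameter values are illustrative assumptions.

```python
import random

def gillespie_prey_predator(x, y, k1a, k2, k3, t_end, rng):
    """Simulate the reactions (7.6.11) by the stochastic simulation algorithm.

    Returns the list of (time, x, y) after each jump, starting at time 0.
    """
    t = 0.0
    path = [(t, x, y)]
    while True:
        rates = (k1a * x, k2 * x * y, k3 * y)
        total = sum(rates)
        if total == 0.0:
            break                       # both populations extinct
        t += rng.expovariate(total)     # exponential waiting time
        if t > t_end:
            break
        u = rng.random() * total
        if u < rates[0]:
            x += 1                      # X + A -> 2X
        elif u < rates[0] + rates[1]:
            x -= 1                      # X + Y -> 2Y
            y += 1
        else:
            y -= 1                      # Y -> B
        path.append((t, x, y))
    return path
```

A call such as `gillespie_prey_predator(50, 20, 1.0, 0.02, 1.0, 5.0, random.Random(0))` returns one realisation; averaging over many realisations gives estimates of the moments.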


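Kramers-Moyal. From (7.5.32)

$$ \boldsymbol{A}(\boldsymbol{x}) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}k_1 a x + \begin{pmatrix} -1 \\ 1 \end{pmatrix}k_2 x y + \begin{pmatrix} 0 \\ -1 \end{pmatrix}k_3 y \tag{7.6.14} $$
$$ = \begin{pmatrix} k_1 a x - k_2 x y \\ k_2 x y - k_3 y \end{pmatrix} \tag{7.6.15} $$

$$ B(\boldsymbol{x}) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}(1, 0)\,k_1 a x + \begin{pmatrix} -1 \\ 1 \end{pmatrix}(-1, 1)\,k_2 x y + \begin{pmatrix} 0 \\ -1 \end{pmatrix}(0, -1)\,k_3 y \tag{7.6.16} $$
$$ = \begin{pmatrix} k_1 a x + k_2 x y & -k_2 x y \\ -k_2 x y & k_2 x y + k_3 y \end{pmatrix}. \tag{7.6.17} $$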


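The deterministic equations are

$$ \frac{d}{dt}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} k_1 a x - k_2 x y \\ k_2 x y - k_3 y \end{pmatrix}. \tag{7.6.18} $$

Stationary state at

$$ \begin{pmatrix} x_s \\ y_s \end{pmatrix} = \begin{pmatrix} k_3/k_2 \\ k_1 a/k_2 \end{pmatrix}. \tag{7.6.19} $$

To determine the stability of this state, we check the stability of the linearised
deterministic equation

$$ \frac{d}{dt}\begin{pmatrix} \delta x \\ \delta y \end{pmatrix} = \frac{\partial\boldsymbol{A}(\boldsymbol{x}_s)}{\partial x_s}\,\delta x + \frac{\partial\boldsymbol{A}(\boldsymbol{x}_s)}{\partial y_s}\,\delta y = \begin{pmatrix} k_1 a - k_2 y_s & -k_2 x_s \\ k_2 y_s & k_2 x_s - k_3 \end{pmatrix}\begin{pmatrix} \delta x \\ \delta y \end{pmatrix} \tag{7.6.20} $$
$$ = \begin{pmatrix} 0 & -k_3 \\ k_1 a & 0 \end{pmatrix}\begin{pmatrix} \delta x \\ \delta y \end{pmatrix}. \tag{7.6.21} $$

The eigenvalues of the matrix are

$$ \lambda = \pm i\,(k_1 k_3 a)^{1/2} \tag{7.6.22} $$

which indicates a periodic motion of any small deviation from the stationary state.
We thus have neutral stability, since the disturbance neither grows nor decays.
This is related to the existence of a conserved quantity

$$ V = k_2(x + y) - k_3\log x - k_1 a\log y \tag{7.6.23} $$

which can readily be checked to satisfy dV/dt = 0. Thus, the system conserves V
and this means that there are different circular trajectories of constant V.
Writing again

$$ x = x_s + \delta x $$
$$ y = y_s + \delta y \tag{7.6.24} $$

and expanding to second order, we see that

$$ V \approx \mathrm{const} + \frac{k_3\,\delta x^2}{2x_s^2} + \frac{k_1 a\,\delta y^2}{2y_s^2} \tag{7.6.25} $$

so that the orbits are initially elliptical (this can also be deduced from the linearised
analysis).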

As the orbits become larger, they become less elliptic and eventually either 
x or y may become zero. 

If x is the first to become zero (all the prey have been eaten), one sees that y 
inevitably proceeds to zero as well. If y becomes zero (all predators have starved to 
death), the prey grow unchecked with exponential growth. 
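
The conservation of V along deterministic orbits can be confirmed numerically. This Python sketch integrates (7.6.18) with a classical fourth-order Runge-Kutta scheme and evaluates the quantity (7.6.23) before and after. The function names, step size and parameter values are assumptions chosen for illustration.

```python
import math

def lotka_volterra_rhs(x, y, k1a, k2, k3):
    """Right-hand side of the deterministic equations (7.6.18)."""
    return k1a * x - k2 * x * y, k2 * x * y - k3 * y

def conserved_v(x, y, k1a, k2, k3):
    """The quantity V of (7.6.23)."""
    return k2 * (x + y) - k3 * math.log(x) - k1a * math.log(y)

def integrate(x, y, k1a, k2, k3, t_end, steps):
    """Classical RK4 integration of (7.6.18) from 0 to t_end."""
    h = t_end / steps
    for _ in range(steps):
        a1, b1 = lotka_volterra_rhs(x, y, k1a, k2, k3)
        a2, b2 = lotka_volterra_rhs(x + h * a1 / 2, y + h * b1 / 2, k1a, k2, k3)
        a3, b3 = lotka_volterra_rhs(x + h * a2 / 2, y + h * b2 / 2, k1a, k2, k3)
        a4, b4 = lotka_volterra_rhs(x + h * a3, y + h * b3, k1a, k2, k3)
        x += h * (a1 + 2 * a2 + 2 * a3 + a4) / 6
        y += h * (b1 + 2 * b2 + 2 * b3 + b4) / 6
    return x, y
```

Any drift in V seen in such a run is numerical error of the integrator, in contrast to the systematic growth of the mean of V produced by the fluctuations.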


Stochastic Behaviour. Because of the conservation of the quantity V, the orbits have 
neutral stability which means that when the fluctuations are included, the system 
will tend to change the size of the orbit with time. We can see this directly from the 
equivalent stochastic differential equations 


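$$ \begin{pmatrix} dx \\ dy \end{pmatrix} = \begin{pmatrix} k_1 a x - k_2 x y \\ k_2 x y - k_3 y \end{pmatrix}dt + C(x,y)\begin{pmatrix} dW_1(t) \\ dW_2(t) \end{pmatrix} \tag{7.6.26} $$

where

$$ C(x,y)\,C(x,y)^{\mathrm{T}} = B(\boldsymbol{x}). \tag{7.6.27} $$

Then using Ito's formula

$$ dV(x,y) = \frac{\partial V}{\partial x}\,dx + \frac{\partial V}{\partial y}\,dy + \frac{1}{2}\left(\frac{\partial^2 V}{\partial x^2}\,dx^2 + 2\frac{\partial^2 V}{\partial x\,\partial y}\,dx\,dy + \frac{\partial^2 V}{\partial y^2}\,dy^2\right) \tag{7.6.28} $$

so that

$$ \langle dV(x,y)\rangle = \left\langle\frac{\partial V}{\partial x}(k_1 a x - k_2 x y) + \frac{\partial V}{\partial y}(k_2 x y - k_3 y)\right\rangle dt + \frac{1}{2}\left\langle B_{11}\frac{k_3}{x^2} + B_{22}\frac{k_1 a}{y^2}\right\rangle dt. \tag{7.6.29} $$

The first average vanishes since V is deterministically conserved and we find

$$ \langle dV(x,y)\rangle = \frac{1}{2}\left\langle\frac{k_3 k_1 a}{x} + \frac{k_3 k_2 y}{x} + \frac{k_1 k_2 a x}{y} + \frac{k_1 k_3 a}{y}\right\rangle dt. \tag{7.6.30} $$

All of these terms are of order $\Omega^{-1}$ and are positive when x and y are positive.
Thus, in the mean, V(x, y) increases steadily. Of course, eventually one or other of
the axes is hit and similar effects occur to the deterministic case. We see that
when x or y vanish, $V = \infty$.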

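Direct implementation of the system size expansion is very cumbersome in this
case, and moment equations prove more useful. These can be derived directly from
the Master equation or from the Fokker-Planck equation. The results differ slightly
from each other, by terms of order inverse volume. For simplicity, we use the FPE
so that

$$ \frac{d}{dt}\begin{pmatrix} \langle x\rangle \\ \langle y\rangle \end{pmatrix} = \begin{pmatrix} k_1 a\langle x\rangle - k_2\langle xy\rangle \\ k_2\langle xy\rangle - k_3\langle y\rangle \end{pmatrix} \tag{7.6.31} $$

$$ \frac{d}{dt}\begin{pmatrix} \langle x^2\rangle \\ \langle xy\rangle \\ \langle y^2\rangle \end{pmatrix} = \frac{1}{dt}\begin{pmatrix} \langle 2x\,dx + dx^2\rangle \\ \langle x\,dy + y\,dx + dx\,dy\rangle \\ \langle 2y\,dy + dy^2\rangle \end{pmatrix} \tag{7.6.32} $$

$$ = \begin{pmatrix} 2k_1 a\langle x^2\rangle - 2k_2\langle x^2 y\rangle + k_1 a\langle x\rangle + k_2\langle xy\rangle \\ k_2\langle x^2 y - y^2 x\rangle + (k_1 a - k_3 - k_2)\langle xy\rangle \\ 2k_2\langle xy^2\rangle - 2k_3\langle y^2\rangle + k_2\langle xy\rangle + k_3\langle y\rangle \end{pmatrix}. \tag{7.6.33} $$

Knowing a system size expansion is valid means that we know all correlations and
variances are of order $1/\Omega$ compared with the means.
We therefore write

$$ x = \langle x\rangle + \delta x $$
$$ y = \langle y\rangle + \delta y \tag{7.6.34} $$

and keep terms only of lowest order. Noting that terms arising from $\langle dx^2\rangle$, $\langle dx\,dy\rangle$
and $\langle dy^2\rangle$ are one order in $\Omega$ smaller than the others, we get

$$ \frac{d}{dt}\begin{pmatrix} \langle x\rangle \\ \langle y\rangle \end{pmatrix} = \begin{pmatrix} k_1 a\langle x\rangle - k_2\langle x\rangle\langle y\rangle \\ k_2\langle x\rangle\langle y\rangle - k_3\langle y\rangle \end{pmatrix} + \begin{pmatrix} -k_2\langle\delta x\,\delta y\rangle \\ k_2\langle\delta x\,\delta y\rangle \end{pmatrix} \tag{7.6.35} $$

$$ \frac{d}{dt}\begin{pmatrix} \langle\delta x^2\rangle \\ \langle\delta x\,\delta y\rangle \\ \langle\delta y^2\rangle \end{pmatrix} = \begin{pmatrix} k_1 a\langle x\rangle + k_2\langle x\rangle\langle y\rangle \\ -k_2\langle x\rangle\langle y\rangle \\ k_2\langle x\rangle\langle y\rangle + k_3\langle y\rangle \end{pmatrix} + \begin{pmatrix} 2k_1 a - 2k_2\langle y\rangle & -2k_2\langle x\rangle & 0 \\ k_2\langle y\rangle & k_1 a - k_3 + k_2(\langle x\rangle - \langle y\rangle) & -k_2\langle x\rangle \\ 0 & 2k_2\langle y\rangle & 2k_2\langle x\rangle - 2k_3 \end{pmatrix}\begin{pmatrix} \langle\delta x^2\rangle \\ \langle\delta x\,\delta y\rangle \\ \langle\delta y^2\rangle \end{pmatrix}. \tag{7.6.36} $$

We note that the means, to lowest order, obey the deterministic equations, but to
next order, the term containing $\langle\delta x\,\delta y\rangle$ will contribute. Thus, let us choose a
simplified case in which

$$ k_1 a = k_3 = 1, \qquad k_2 = \alpha \tag{7.6.37} $$

which can always be done by rescaling variables. Also abbreviating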


$$ \langle x\rangle \to x, \quad \langle y\rangle \to y, \quad \langle\delta x^2\rangle \to f, \quad \langle\delta x\,\delta y\rangle \to g, \quad \langle\delta y^2\rangle \to h, \tag{7.6.38} $$

we obtain

$$ \frac{d}{dt}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x - \alpha x y - \alpha g \\ \alpha x y - y + \alpha g \end{pmatrix} $$

$$ \frac{d}{dt}\begin{pmatrix} f \\ g \\ h \end{pmatrix} = \begin{pmatrix} x + \alpha x y \\ -\alpha x y \\ \alpha x y + y \end{pmatrix} + \begin{pmatrix} 2 - 2\alpha y & -2\alpha x & 0 \\ \alpha y & \alpha(x - y) & -\alpha x \\ 0 & 2\alpha y & 2\alpha x - 2 \end{pmatrix}\begin{pmatrix} f \\ g \\ h \end{pmatrix}. \tag{7.6.39} $$

We can attempt to solve these equations in a stationary state. Bearing in mind that
f, g, h are a factor $\Omega^{-1}$ smaller than x and y, this requires $\alpha$ to be of order $\Omega^{-1}$ [this
also follows from the scaling requirements (7.5.29)]. Hence $\alpha$ is small. To lowest
order one has

$$ x_s = y_s = 1/\alpha. \tag{7.6.40} $$

But the equations for f, g, h in the stationary state then become

$$ \begin{pmatrix} 2g \\ h - f \\ -2g \end{pmatrix} = \begin{pmatrix} 2/\alpha \\ -1/\alpha \\ 2/\alpha \end{pmatrix} \tag{7.6.41} $$

which are inconsistent. Thus this method does not yield a stationary state.
Alternatively one can solve all of (7.6.39) in a stationary state.

After some manipulation one finds

$$ x_s = y_s $$
$$ g_s = \alpha^{-1}(x_s - \alpha x_s^2) \tag{7.6.42} $$

so that

$$ f_s = x_s(-2\alpha x_s^2 + x_s(2 - \alpha) - 1)/(2 - 2\alpha x_s) $$
$$ h_s = x_s(-2\alpha x_s^2 + x_s(2 + \alpha) + 1)/(2 - 2\alpha x_s) \tag{7.6.43} $$

and the equation for $g_s$ gives

$$ -\alpha x_s^2 + \alpha x_s(f_s - h_s) = 0 \tag{7.6.44} $$

giving a solution for $x_s$, $y_s$, etc.

$$ x_s = y_s = \tfrac{1}{2} $$
$$ f_s = \tfrac{1}{2}\alpha/(\alpha - 2) $$
$$ g_s = \tfrac{1}{4}(2 - \alpha)/\alpha \tag{7.6.45} $$
$$ h_s = -1/(\alpha - 2) $$

and for small $\alpha$, in which the method is valid, this leads to a negative value for $f_s$,
which is, by definition, positive. Thus there is no stationary solution.

By again approximating $x_s = y_s = 1/\alpha$, the differential equations for f, g, and h
can easily be solved. We find, on assuming that initially the system has zero
variances and correlations,

$$ f(t) = -\frac{1}{2\alpha}(\cos 2t - 1) + \frac{2t}{\alpha} $$
$$ g(t) = -\frac{1}{2\alpha}\sin 2t \tag{7.6.46} $$
$$ h(t) = \frac{1}{2\alpha}(\cos 2t - 1) + \frac{2t}{\alpha}. $$

Notice that f(t) and h(t) are, in fact, always positive and increase steadily. The
solution is valid only for a short time since the increasing value of g(t) will eventually
generate a time-dependent mean.
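
The short-time solution for the variances can be checked against a direct numerical integration. With x = y = 1/α the equations (7.6.39) for f, g, h reduce to df/dt = 2/α - 2g, dg/dt = -1/α + f - h, dh/dt = 2/α + 2g, and the Python sketch below integrates these from zero initial values and compares with the closed form obtained from that linear system. The function names and the value of α are assumptions for illustration.

```python
import math

def closed_form(t, alpha):
    """Solution of the linear system with f = g = h = 0 at t = 0."""
    f = 2.0 * t / alpha + (1.0 - math.cos(2.0 * t)) / (2.0 * alpha)
    g = -math.sin(2.0 * t) / (2.0 * alpha)
    h = 2.0 * t / alpha - (1.0 - math.cos(2.0 * t)) / (2.0 * alpha)
    return f, g, h

def rhs(f, g, h, alpha):
    """Equations (7.6.39) for f, g, h evaluated at x = y = 1/alpha."""
    return 2.0 / alpha - 2.0 * g, -1.0 / alpha + f - h, 2.0 / alpha + 2.0 * g

def rk4_step(s, dt, alpha):
    """One classical RK4 step for the state s = (f, g, h)."""
    def shifted(base, slope, scale):
        return tuple(b + scale * k for b, k in zip(base, slope))
    k1 = rhs(*s, alpha)
    k2 = rhs(*shifted(s, k1, dt / 2), alpha)
    k3 = rhs(*shifted(s, k2, dt / 2), alpha)
    k4 = rhs(*shifted(s, k3, dt), alpha)
    return tuple(si + dt * (a + 2 * b + 2 * c + d) / 6
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def integrate(alpha, t_end, steps=4000):
    """Integrate from zero variances and correlation up to t_end."""
    s = (0.0, 0.0, 0.0)
    dt = t_end / steps
    for _ in range(steps):
        s = rk4_step(s, dt, alpha)
    return s
```

Both variances grow linearly on average, at the rate 2/α, with an oscillation of period π superimposed.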


7.6.4 Generating Function Equations 


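In the case of combinatorial kinetics, a relatively simple differential equation can be
derived for the generating function:

$$ G(\boldsymbol{s}, t) = \sum_{\boldsymbol{x}}\left(\prod_a s_a^{x_a}\right)P(\boldsymbol{x}, t). \tag{7.6.47} $$

For we note that

$$ \partial_t G(\boldsymbol{s}, t) = \partial_t^+ G(\boldsymbol{s}, t) + \partial_t^- G(\boldsymbol{s}, t) \tag{7.6.48} $$

where the two terms correspond to the $t^+$ and $t^-$ parts of the master equation. Thus

$$ \partial_t^+ G(\boldsymbol{s}, t) = \sum_{A,\boldsymbol{x}} k_A^+\left\{\prod_a\frac{(x_a - r_a^A)!}{(x_a - r_a^A - N_a^A)!}\,s_a^{x_a}\,P(\boldsymbol{x} - \boldsymbol{r}^A, t) - \prod_a\frac{x_a!}{(x_a - N_a^A)!}\,s_a^{x_a}\,P(\boldsymbol{x}, t)\right\}. \tag{7.6.49} $$

Changing the summation variable to $\boldsymbol{x} - \boldsymbol{r}^A$ and renaming this as $\boldsymbol{x}$, in the first
term we find

$$ \partial_t^+ G(\boldsymbol{s}, t) = \sum_{A,\boldsymbol{x}} k_A^+\left\{\prod_a\frac{x_a!}{(x_a - N_a^A)!}\,s_a^{x_a + r_a^A} - \prod_a\frac{x_a!}{(x_a - N_a^A)!}\,s_a^{x_a}\right\}P(\boldsymbol{x}, t). \tag{7.6.50} $$

Note that

$$ \prod_a\frac{x_a!}{(x_a - N_a^A)!}\,s_a^{x_a} = \prod_a\left(s_a^{N_a^A}\,\partial_a^{N_a^A}\right)s_a^{x_a} \tag{7.6.51} $$

and that

$$ \prod_a s_a^{x_a + r_a^A}\,\frac{x_a!}{(x_a - N_a^A)!} = \prod_a\left(s_a^{M_a^A}\,\partial_a^{N_a^A}\right)s_a^{x_a} \tag{7.6.52} $$

(with $\partial_a = \partial/\partial s_a$) so that

$$ \partial_t^+ G(\boldsymbol{s}, t) = \sum_A k_A^+\left(\prod_a s_a^{M_a^A} - \prod_a s_a^{N_a^A}\right)\prod_a\partial_a^{N_a^A}\,G(\boldsymbol{s}, t). \tag{7.6.53} $$

Similarly, we derive a formula for $\partial_t^- G(\boldsymbol{s}, t)$ and put these together to get

$$ \partial_t G(\boldsymbol{s}, t) = \sum_A\left(\prod_a s_a^{M_a^A} - \prod_a s_a^{N_a^A}\right)\left(k_A^+\prod_a\partial_a^{N_a^A} - k_A^-\prod_a\partial_a^{M_a^A}\right)G(\boldsymbol{s}, t) \tag{7.6.54} $$

which is the general formula for a generating function differential equation. We now
give a few examples.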


a) An Exactly Soluble Model 
Reactions: (A, B, C held fixed) 


AtxX™.2¥+D Nt=1,M'=2;r'= | 


kt =k,A (7.6.55) 
Kan 
k 
B+ X= N?=1,M?=0:r? =—1! 
3 
ki = kB (7.6.56) 
Ke — k3C . 


Hence, from (7.6.54), the generating function equation is 


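    \partial_t G = (s^2 - s)(k_2 A \partial_s G) + (1 - s)(k_1 B \partial_s G - k_3 C G).    (7.6.57)

Solve by characteristics.

Set

    k_1 B = \beta,  k_2 A = \alpha,  k_3 C = \gamma.    (7.6.58)

The characteristics are

    \frac{dt}{1} = -\frac{ds}{(1 - s)(\beta - \alpha s)} = -\frac{dG}{\gamma (1 - s) G}    (7.6.59)

which integrate to

    \Bigl( \frac{1 - s}{\beta - \alpha s} \Bigr) e^{(\alpha - \beta)t} = u    (7.6.60)

    (\beta - \alpha s)^{\gamma/\alpha} G = v.    (7.6.61)

The general solution can be written v = F(u), i.e.,

    G = (\beta - \alpha s)^{-\gamma/\alpha} F\Bigl[ e^{(\alpha - \beta)t} \Bigl( \frac{1 - s}{\beta - \alpha s} \Bigr) \Bigr].    (7.6.62)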


From this we can find various time-dependent solutions. The conditional probabi- 
lity P(x, t| y, 0) comes from the initial condition 


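    G_y(s, 0) = s^y    (7.6.63)

    \Rightarrow F(z) = (1 - \beta z)^y (1 - \alpha z)^{-\gamma/\alpha - y} (\beta - \alpha)^{\gamma/\alpha}    (7.6.64)

    \Rightarrow G_y(s, t) = \lambda^{\gamma/\alpha} [\beta(1 - e^{-\lambda t}) - s(\alpha - \beta e^{-\lambda t})]^y
                            \times [(\beta - \alpha e^{-\lambda t}) - \alpha s(1 - e^{-\lambda t})]^{-\gamma/\alpha - y}    (7.6.65)

(with λ = β − α).
As t → ∞, a stationary state exists only if β > α and is

    G_s(s, \infty) = (\beta - \alpha s)^{-\gamma/\alpha} (\beta - \alpha)^{\gamma/\alpha}    (7.6.66)

    \Rightarrow P_s(x) = \frac{\Gamma(x + \gamma/\alpha)}{\Gamma(\gamma/\alpha)\, x!} \Bigl( \frac{\alpha}{\beta} \Bigr)^x \Bigl( \frac{\beta - \alpha}{\beta} \Bigr)^{\gamma/\alpha}.    (7.6.67)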


We can also derive moment equations from the generating function equations by 
noting 


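    \partial_s G(s, t)|_{s=1} = \langle x(t) \rangle
                                                                      (7.6.68)
    \partial_s^2 G(s, t)|_{s=1} = \langle x(t)[x(t) - 1] \rangle.

Proceeding this way we have

    \frac{d}{dt} \langle x(t) \rangle = (k_2 A - k_1 B) \langle x(t) \rangle + k_3 C    (7.6.69)

and

    \frac{d}{dt} \langle x(t)[x(t) - 1] \rangle = 2(k_2 A - k_1 B) \langle x(t)[x(t) - 1] \rangle
                                                  + 2 k_2 A \langle x(t) \rangle + 2 k_3 C \langle x(t) \rangle.    (7.6.70)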


These equations have a stable stationary solution provided

    k_2 A < k_1 B,  i.e.,  \alpha < \beta.

In this case, the stationary mean and variance are

    \langle x \rangle_s = k_3 C/(k_1 B - k_2 A)    (7.6.71)

    var\{x\}_s = k_1 k_3 BC/(k_2 A - k_1 B)^2.    (7.6.72)
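
The stationary results can be checked numerically. The sketch below is only an illustration: the function names and the parameter values α = 1, β = 2, γ = 3 are assumptions, not taken from the text. It builds the distribution (7.6.67), truncated at a large x, and compares its mean and variance with (7.6.71) and (7.6.72).

```python
import math

def stationary_distribution(alpha, beta, gamma, x_max=400):
    """Stationary distribution (7.6.67); requires beta > alpha (illustrative helper)."""
    if not beta > alpha:
        raise ValueError("a stationary state requires beta > alpha")
    nu = gamma / alpha
    log_norm = nu * math.log((beta - alpha) / beta)
    probs = []
    for x in range(x_max + 1):
        # log of Gamma(x + nu) / (Gamma(nu) x!) * (alpha/beta)^x * normalisation
        log_p = (math.lgamma(x + nu) - math.lgamma(nu) - math.lgamma(x + 1)
                 + x * math.log(alpha / beta) + log_norm)
        probs.append(math.exp(log_p))
    return probs

def mean_and_variance(probs):
    """Mean and variance of a distribution given as a list indexed by x."""
    mean = sum(x * p for x, p in enumerate(probs))
    second = sum(x * x * p for x, p in enumerate(probs))
    return mean, second - mean * mean

# Assumed illustrative values: alpha = k2*A, beta = k1*B, gamma = k3*C.
alpha, beta, gamma = 1.0, 2.0, 3.0
probs = stationary_distribution(alpha, beta, gamma)
mean, var = mean_and_variance(probs)
print("mean:", mean, "expected (7.6.71):", gamma / (beta - alpha))
print("variance:", var, "expected (7.6.72):", gamma * beta / (beta - alpha) ** 2)
```

The truncation at x_max is harmless here because the tail decays like (α/β)^x; closer to the critical point α → β a much larger cutoff would be needed.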


This model is a simple representation of the processes taking place in a nuclear 
reactor. Here X is a neutron. The first reaction represents the fission by absorption 
of a neutron by A to produce residue(s) D plus two neutrons. The second re- 
presents absorption of neutrons and production by means other than fission. 

As k_2 A approaches k_1 B, we approach a critical situation where neutrons are absorbed and created in almost equal numbers. For k_2 A > k_1 B, an explosive chain reaction occurs. Notice that ⟨x⟩_s and var{x}_s both become very large as a critical point is approached and, in fact,

    \frac{var\{x\}_s}{\langle x \rangle_s} = \frac{k_1 B}{k_1 B - k_2 A} \to \infty.    (7.6.73)

Thus, there are very large fluctuations in x near the critical point.
Note also that the system has linear equations for the mean and is Markovian, so the methods of Sect. 3.7.4 (the regression theorem) show that

    \langle x(t), x(0) \rangle_s = \exp[(k_2 A - k_1 B)t]\, var\{x\}_s    (7.6.74)

so that the fluctuations become vanishingly slow as the critical point is approached, i.e., the time correlation function decays very slowly with time.


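b) Chemical Reaction X_1 ⇌ X_2 (forward rate constant k_1, reverse rate constant k_2)

One reaction

    k^+ = k_1,  k^- = k_2    (7.6.75)

    \partial_t G(s_1, s_2, t) = (s_2 - s_1)(k_1 \partial_{s_1} - k_2 \partial_{s_2}) G(s_1, s_2, t)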


can be solved by characteristics. The generating function is an arbitrary function 
of solutions of 


    \frac{dt}{1} = -\frac{ds_1}{k_1 (s_2 - s_1)} = \frac{ds_2}{k_2 (s_2 - s_1)}.    (7.6.76)

Two integrals are solutions of

    k_2\, ds_1 + k_1\, ds_2 = 0 \;\Rightarrow\; k_2 s_1 + k_1 s_2 = v    (7.6.77)

    (k_1 + k_2)\, dt = \frac{d(s_2 - s_1)}{s_2 - s_1}    (7.6.78)

    \Rightarrow (s_2 - s_1) e^{-(k_1 + k_2)t} = u

    \therefore G(s_1, s_2, t) = F[k_2 s_1 + k_1 s_2, (s_2 - s_1) e^{-(k_1 + k_2)t}].    (7.6.79)

The initial condition (Poissonian)

    G(s_1, s_2, 0) = \exp[\alpha(s_1 - 1) + \beta(s_2 - 1)]    (7.6.80)

gives the Poissonian solution:

    G(s_1, s_2, t) = \exp\Bigl\{ \frac{k_2 \beta - k_1 \alpha}{k_1 + k_2} (s_2 - s_1) e^{-(k_1 + k_2)t}
                     + \frac{\alpha + \beta}{k_1 + k_2} [k_2 (s_1 - 1) + k_1 (s_2 - 1)] \Bigr\}.    (7.6.81)


In this case, the stationary solution is not unique because x_1 + x_2 is a conserved quantity. From (7.6.79) we see that the general stationary solution is of the form

    G(s_1, s_2, \infty) = F(k_2 s_1 + k_1 s_2, 0).    (7.6.82)

Thus,

    k_1^n \frac{\partial^n G}{\partial s_1^n} = k_2^n \frac{\partial^n G}{\partial s_2^n}    (7.6.83)

which implies that, setting s_1 = s_2 = 1,

    k_1^n \langle x_1^n \rangle_f = k_2^n \langle x_2^n \rangle_f.    (7.6.84)


7.7 The Poisson Representation [7.10] 


This is a particularly elegant technique which generates Fokker-Planck equations 
which are equivalent to chemical Master equations of the form (7.5.9). 


We assume that we can expand P(x, t) as a superposition of multivariate uncorrelated Poissons:

    P(x, t) = \int d\alpha \prod_a \frac{e^{-\alpha_a} \alpha_a^{x_a}}{x_a!} f(\alpha, t).    (7.7.1)

This means that the generating function G(s, t) can be written

    G(s, t) = \int d\alpha \exp\Bigl[ \sum_a (s_a - 1)\alpha_a \Bigr] f(\alpha, t).    (7.7.2)


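We substitute this in the generating function equation (7.6.54) to get

    \partial_t G(s, t) = \sum_A \int d\alpha \Bigl\{ \Bigl[ \prod_a \Bigl( \frac{\partial}{\partial \alpha_a} + 1 \Bigr)^{M_a^A} - \prod_a \Bigl( \frac{\partial}{\partial \alpha_a} + 1 \Bigr)^{N_a^A} \Bigr] \exp\Bigl[ \sum_a (s_a - 1)\alpha_a \Bigr] \Bigr\}
                         \times \Bigl( k_A^+ \prod_a \alpha_a^{N_a^A} - k_A^- \prod_a \alpha_a^{M_a^A} \Bigr) f(\alpha, t),    (7.7.3)

where the derivatives act on the exponential only.

We now integrate by parts, drop surface terms and finally equate coefficients of the exponential to obtain

    \frac{\partial f(\alpha, t)}{\partial t} = \sum_A \Bigl[ \prod_a \Bigl( 1 - \frac{\partial}{\partial \alpha_a} \Bigr)^{M_a^A} - \prod_a \Bigl( 1 - \frac{\partial}{\partial \alpha_a} \Bigr)^{N_a^A} \Bigr]
                                               \times \Bigl( k_A^+ \prod_a \alpha_a^{N_a^A} - k_A^- \prod_a \alpha_a^{M_a^A} \Bigr) f(\alpha, t).    (7.7.4)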


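a) Fokker-Planck Equations for Bimolecular Reaction Systems
This equation is of the Fokker-Planck form if we have, as is usual in real chemical reactions,

    \sum_a M_a^A \le 2,  \sum_a N_a^A \le 2,    (7.7.5)

which indicates that only pairs of molecules at the most participate in reactions. The FPE can then be simplified as follows. Define the currents

    J_A(\alpha) = k_A^+ \prod_a \alpha_a^{N_a^A} - k_A^- \prod_a \alpha_a^{M_a^A},    (7.7.6)

the drifts

    A_a[J(\alpha)] = \sum_A r_a^A J_A(\alpha),    (7.7.7)

and the diffusion matrix elements by

    B_{ab}[J(\alpha)] = \sum_A J_A(\alpha) (M_a^A M_b^A - N_a^A N_b^A - \delta_{a,b} r_a^A).    (7.7.8)

Then the Poisson representation FPE is

    \frac{\partial f(\alpha, t)}{\partial t} = -\sum_a \frac{\partial}{\partial \alpha_a} \{ A_a[J(\alpha)] f(\alpha, t) \}
                                               + \frac{1}{2} \sum_{a,b} \frac{\partial^2}{\partial \alpha_a \partial \alpha_b} \{ B_{ab}[J(\alpha)] f(\alpha, t) \}.    (7.7.9)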


Notice also that if we use the explicit volume dependence of the parameters given in Sect. 7.5.3 (7.5.29) and define
    \eta_a = \alpha_a / V    (7.7.10)

    \varepsilon = V^{-1/2}    (7.7.11)

and F(η, t) is the quasiprobability in the η variable, then the FPE for the η variable takes the form of

    \frac{\partial F(\eta, t)}{\partial t} = -\sum_a \frac{\partial}{\partial \eta_a} [A_a(\eta) F(\eta, t)] + \frac{\varepsilon^2}{2} \sum_{a,b} \frac{\partial^2}{\partial \eta_a \partial \eta_b} [B_{ab}(\eta) F(\eta, t)]    (7.7.12)

    A_a(\eta) = \sum_A r_a^A J_A(\eta)    (7.7.13a)

    J_A(\eta) = \kappa_A^+ \prod_a \eta_a^{N_a^A} - \kappa_A^- \prod_a \eta_a^{M_a^A}    (7.7.13b)

    B_{ab}(\eta) = \sum_A J_A(\eta) (M_a^A M_b^A - N_a^A N_b^A - \delta_{a,b} r_a^A).    (7.7.13c)

In this form we see how the system size expansion in V^{-1/2} corresponds exactly to a small noise expansion in η of the FPE (7.7.12). For such birth-death Master equations, this method is technically much simpler than a direct system size expansion.


b) Unimolecular Reactions
If for all A,

    \sum_a M_a^A \le 1  and  \sum_a N_a^A \le 1,

then it is easily checked that the diffusion coefficient B_{ab}(η) in (7.7.13) vanishes, and we have a Liouville equation. An initially Poissonian P(x, t_0) corresponds to a delta function F(η, t_0), and the time evolution generated by this Liouville equation will generate a delta function solution, δ(η − η̄(t)), where η̄(t) is the solution of

    d\eta/dt = A(\eta).

This means that P(x, t) will preserve a Poissonian form, with mean equal to η̄(t). Thus we derive the general result, that there exist propagating multipoissonian solutions for any unimolecular reaction system. Non-Poissonian solutions also exist; these correspond to initial F(η, t_0) which are not delta functions.
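
The propagating Poissonian solution can be verified directly for a simple case. The sketch below is an illustration under stated assumptions: the one-variable scheme (creation of X at constant rate k_in, decay of X at rate k_out·x), the function names and the numerical values are hypothetical and not from the text. It checks that a Poisson distribution whose mean obeys the deterministic rate equation satisfies the birth-death master equation at every x.

```python
import math

def poisson(x, mean):
    """Poisson probability of x for the given mean (zero for negative x)."""
    if x < 0:
        return 0.0
    return math.exp(-mean + x * math.log(mean) - math.lgamma(x + 1))

def master_rhs(x, mean, k_in, k_out):
    """Right-hand side of the master equation evaluated on a Poisson distribution.

    Assumed scheme: creation of X at rate k_in, decay of X at rate k_out * x.
    """
    gain = k_in * poisson(x - 1, mean) + k_out * (x + 1) * poisson(x + 1, mean)
    loss = (k_in + k_out * x) * poisson(x, mean)
    return gain - loss

def poisson_time_derivative(x, mean, k_in, k_out):
    """Time derivative of a Poisson whose mean obeys d(mean)/dt = k_in - k_out * mean."""
    dmean_dt = k_in - k_out * mean
    # d/d(mean) of the Poisson probability is poisson(x-1) - poisson(x)
    return dmean_dt * (poisson(x - 1, mean) - poisson(x, mean))

k_in, k_out, mean = 3.0, 0.7, 1.9  # arbitrary illustrative values
residual = max(
    abs(master_rhs(x, mean, k_in, k_out) - poisson_time_derivative(x, mean, k_in, k_out))
    for x in range(60)
)
print("largest mismatch between the two sides:", residual)
```

The mismatch is at rounding level for any positive mean, not only the stationary one, which is the content of the statement that the Poissonian form propagates in time.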


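c) Example
As an example, consider the reaction pair

    i)  A + X \underset{k_4}{\overset{k_2}{\rightleftharpoons}} 2X
                                                                      (7.7.14)
    ii) B + X \underset{k_3}{\overset{k_1}{\rightleftharpoons}} C

    N^1 = 1,  M^1 = 2,  k_1^+ = k_2 A,  k_1^- = k_4
    N^2 = 1,  M^2 = 0,  k_2^+ = k_1 B,  k_2^- = k_3 C

so that (7.7.4) takes the form

    \frac{\partial f}{\partial t} = \Bigl[ \Bigl( 1 - \frac{\partial}{\partial \alpha} \Bigr)^2 - \Bigl( 1 - \frac{\partial}{\partial \alpha} \Bigr) \Bigr] (k_2 A \alpha - k_4 \alpha^2) f
                                    + \Bigl[ 1 - \Bigl( 1 - \frac{\partial}{\partial \alpha} \Bigr) \Bigr] (k_1 B \alpha - k_3 C) f    (7.7.15)

                                  = \Bigl\{ -\frac{\partial}{\partial \alpha} [k_3 C + (k_2 A - k_1 B)\alpha - k_4 \alpha^2] + \frac{\partial^2}{\partial \alpha^2} [k_2 A \alpha - k_4 \alpha^2] \Bigr\} f    (7.7.16)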


which is of the Fokker-Planck form, provided k_2 A α − k_4 α² > 0. Furthermore, there is the simple relationship between moments, which takes the form (in the case of one variable)

    \langle x^r \rangle_f \equiv \sum_x \int d\alpha\, [x(x - 1) \cdots (x - r + 1)] \frac{e^{-\alpha} \alpha^x}{x!} f(\alpha)
                          = \int d\alpha\, \alpha^r f(\alpha) \equiv \langle \alpha^r \rangle.    (7.7.17)

This follows from the factorial moments of the Poisson distribution (Sect. 2.8.3). However, f(α) is not a probability or at least, is not guaranteed to be a probability in the simple-minded way it is defined. This is clear, since any positive superposition of Poisson distributions must have a variance at least as wide as the Poisson distribution. Hence any P(x) for which the variance is less than that of the Poisson distribution, i.e., var{x} < ⟨x⟩, cannot be represented by a positive f(α).

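A representation in terms of distributions is always possible, at least formally. For if we define (with δ^{(y)} the y-th derivative of the delta function)

    f_y(\alpha) = (-1)^y \delta^{(y)}(\alpha) e^{\alpha},    (7.7.18)

then

    \int d\alpha\, f_y(\alpha) e^{-\alpha} \alpha^x / x! = \int d\alpha\, \alpha^x \Bigl( -\frac{d}{d\alpha} \Bigr)^y \delta(\alpha) / x!    (7.7.19)

and integrating by parts

    = \delta_{x,y},    (7.7.20)

which means that we can write

    P(x) = \int d\alpha\, (e^{-\alpha} \alpha^x / x!) \Bigl[ \sum_y (-1)^y P(y) \delta^{(y)}(\alpha) e^{\alpha} \Bigr]    (7.7.21)

so that in a formal sense, an f(α) can always be found for any P(x).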

The rather singular form just given does not, in fact, normally arise since, for example, we can find the stationary solution of the FPE (7.7.16) as the potential solution (up to a normalisation)

    f_s(\alpha) = e^{\alpha} (k_2 A - k_4 \alpha)^{(k_1 B/k_4 - k_3 C/k_2 A - 1)} \alpha^{k_3 C/k_2 A - 1}    (7.7.22)

which is a relatively smooth function. However, an interpretation as a probability is only possible if f_s(α) is positive or zero and is normalisable.
If we define

    \delta = (k_1 B/k_4 - k_3 C/k_2 A),    (7.7.23)

then f_s(α) is normalisable on the interval (0, k_2 A/k_4) provided that

    \delta > 0
                    (7.7.24)
    k_3 > 0.

Clearly, by definition, k_3 must be positive.
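
For δ > 0 the integral of the Poisson kernel against the potential solution (7.7.22) over (0, k_2 A/k_4) reduces to a Beta function, so the resulting P_s(x) can be tested against the stationary master equation. The sketch below is an illustration only: the function names and rate values are assumptions, and the transition rates used are the combinatorial-kinetics rates implied by the reaction pair (7.7.14).

```python
import math

def log_beta(p, q):
    """Logarithm of the Beta function B(p, q)."""
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)

def stationary_p(k1B, k2A, k3C, k4, x_max=60):
    """P_s(x) obtained by integrating the Poisson kernel against f_s of (7.7.22).

    With alpha = (k2A/k4) * u the integral over (0, k2A/k4) becomes
    (k2A/k4)**x * B(x + c, delta) / x!  up to an x-independent constant,
    where c = k3C/k2A and delta is defined in (7.7.23).
    """
    c = k3C / k2A
    delta = k1B / k4 - c
    if delta <= 0 or k3C <= 0:
        raise ValueError("real Poisson representation needs delta > 0 and k3 > 0")
    alpha_max = k2A / k4
    weights = [
        math.exp(x * math.log(alpha_max) + log_beta(x + c, delta) - math.lgamma(x + 1))
        for x in range(x_max + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]

def balance_residual(probs, k1B, k2A, k3C, k4):
    """Largest violation of t+(x) P(x) = t-(x+1) P(x+1) for the one-step process."""
    worst = 0.0
    for x in range(len(probs) - 1):
        birth = k2A * x + k3C                       # A + X -> 2X and C -> B + X
        death = k1B * (x + 1) + k4 * (x + 1) * x    # B + X -> C and 2X -> A + X
        worst = max(worst, abs(birth * probs[x] - death * probs[x + 1]))
    return worst

# Assumed illustrative values giving delta = 3 - 0.5 = 2.5 > 0.
k1B, k2A, k3C, k4 = 3.0, 4.0, 2.0, 1.0
probs = stationary_p(k1B, k2A, k3C, k4)
print("largest balance violation:", balance_residual(probs, k1B, k2A, k3C, k4))
```

The closed form is used instead of numerical quadrature because the integrand can be singular at the end points when the exponents in (7.7.22) are negative.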

It must further be checked that the integrations by parts used to derive the FPE (7.7.4) are such that under these conditions, surface terms vanish. For an interval (a, b) the surface terms which would arise in the case of the reaction (7.7.14) can be written

    [\{ (k_2 A \alpha - k_4 \alpha^2 - k_1 B \alpha + k_3 C) f - \partial_\alpha [(k_2 A \alpha - k_4 \alpha^2) f] \} e^{(s-1)\alpha}]_a^b
    + [(k_2 A \alpha - k_4 \alpha^2) f (s - 1) e^{(s-1)\alpha}]_a^b.    (7.7.25)

Because of the extra factor (s − 1) on the second line, each line must vanish separately. It is easily checked that on the interval (0, k_2 A/k_4), each term vanishes at each end of the interval for the choice (7.7.22) of f, provided δ and k_3 are both greater than zero.

In the case where k_3 and δ are both positive, we have a genuine FPE equivalent to the stochastic differential equation

    d\alpha = [k_3 C + (k_2 A - k_1 B)\alpha - k_4 \alpha^2]\, dt + \sqrt{2(k_2 A \alpha - k_4 \alpha^2)}\, dW(t).    (7.7.26)

The motion takes place on the range (0, k_2 A/k_4) and both boundaries satisfy the criteria for entrance boundaries, which means that it is not possible to leave the range (0, k_2 A/k_4) (Sect. 5.2.1).

If either of the conditions (7.7.24) is violated, it is found that the drift vector is such as to take the point outside the interval (0, k_2 A/k_4). For example, near α = 0 we have

    d\alpha \sim k_3 C\, dt    (7.7.27)

and if k_3 C is negative, α will proceed to negative values. In this case, the coefficient of dW(t) in (7.7.26) becomes imaginary and interpretation is no longer possible without further explanation.

Of course, viewed as a SDE in the complex variable

    \alpha = \alpha_x + i\alpha_y,    (7.7.28)

the SDE is perfectly sensible and is really a pair of stochastic differential equations for the two variables α_x and α_y. However, the corresponding FPE is no longer the one-variable equation (7.7.16) but a two-variable FPE. We can derive such a FPE in terms of variations of the Poisson representation, which we now treat.


7.7.1 Kinds of Poisson Representations 
Let us consider the case of one variable and write

    P(x) = \int_{\mathcal{D}} d\mu(\alpha)\, (e^{-\alpha} \alpha^x / x!) f(\alpha).    (7.7.29)

Then μ(α) is a measure which we will show may be chosen in three ways which all lead to useful representations, and 𝒟 is the domain of integration, which can take on various forms, depending on the choice of measure.


7.7.2 Real Poisson Representations 
Here we choose

    d\mu(\alpha) = d\alpha    (7.7.30)

and 𝒟 is a section of the real line. As noted in the preceding example, this representation does not always exist, but where it does, a simple interpretation in terms of Fokker-Planck equations is possible.


7.7.3 Complex Poisson Representations 


Here,

    d\mu(\alpha) = d\alpha    (7.7.31)

and 𝒟 is a contour C in the complex plane. We can show that this exists under certain restrictive conditions. For, instead of the form (7.7.18), we can choose

    f_y(\alpha) = \frac{y!}{2\pi i} \alpha^{-y-1} e^{\alpha}    (7.7.32)

and C to be a contour surrounding the origin. This means that

    P_y(x) = \frac{1}{2\pi i} \frac{y!}{x!} \oint_C d\alpha\, \alpha^{x-y-1} = \delta_{x,y}.    (7.7.33)

By appropriate summation, we may express a given P(x) in terms of an f(α) given by

    f(\alpha) = \frac{1}{2\pi i} \sum_y P(y) e^{\alpha} \alpha^{-y-1} y!.    (7.7.34)

If the P(y) are such that for all y, y!P(y) is bounded, the series has a finite radius of convergence outside which f(α) is analytic. By choosing C to be outside this circle of convergence, we can take the integration inside the summation to find that P(x) is given by

    P(x) = \oint_C d\alpha\, (e^{-\alpha} \alpha^x / x!) f(\alpha).    (7.7.35)


a) Example: Reactions (1) A + X ⇌ 2X, (2) B + X ⇌ C

We use the notation of Sect. 7.7 and distinguish three cases, depending on the magnitude of δ. The quantity δ gives a measure of the direction in which the reaction system (7.7.14) is proceeding when a steady state exists. If δ > 0, we find that when x has its steady state value, reaction (1) is producing X while reaction (2) consumes X. When δ = 0, both reactions balance separately; thus we have chemical equilibrium. When δ < 0, reaction (1) consumes X while reaction (2) produces X.

i) δ > 0: according to (7.7.24), this is the condition for f_s(α) to be a valid quasiprobability on the real interval (0, k_2 A/k_4). In this range, the diffusion coefficient (k_2 A α − k_4 α²) is positive. The deterministic mean of α, given by

    \alpha = \frac{k_2 A - k_1 B + [(k_2 A - k_1 B)^2 + 4 k_3 k_4 C]^{1/2}}{2 k_4}    (7.7.36)

lies within the interval (0, k_2 A/k_4). We are therefore dealing with the case of a genuine FPE and f_s(α) is a function vanishing at both ends of the interval and peaked near the deterministic steady state.


ii) δ = 0: since both reactions now balance separately, we expect a Poissonian steady state. We note that f_s(α) in this case has a pole at α = k_2 A/k_4 and we choose the range of α to be a contour in the complex plane enclosing this pole. Since this is a closed contour, there are no boundary terms arising from partial integration and P_s(x) given by choosing this type of Poisson representation clearly satisfies the steady state Master equation. Now using the calculus of residues, we see that

    P_s(x) = \frac{e^{-\alpha_0} \alpha_0^x}{x!}    (7.7.37)

with

    \alpha_0 = k_2 A / k_4.

iii) δ < 0: when δ < 0 we meet some very interesting features. The steady state solution (7.7.22) now no longer satisfies the condition δ > 0. However, if the range of α is chosen to be a contour C in the complex plane (Fig. 7.3) and we employ the complex Poisson representation, P_s(x) constructed as

    P_s(x) = \int_C d\alpha\, f_s(\alpha) \frac{e^{-\alpha} \alpha^x}{x!}    (7.7.38)

is a solution of the Master equation. The deterministic steady state now occurs at a point on the real axis to the right of the singularity at α = k_2 A/k_4, and asymptotic evaluations of means, moments, etc., may be obtained by choosing C to pass through the saddle point that occurs there. In doing so, one finds that the variance of α, defined as

    var\{\alpha\} = \langle \alpha^2 \rangle - \langle \alpha \rangle^2,    (7.7.39)

is negative, so that

    var\{x\} = \langle x^2 \rangle - \langle x \rangle^2 = \langle \alpha^2 \rangle - \langle \alpha \rangle^2 + \langle \alpha \rangle < \langle x \rangle.    (7.7.40)

This means that the steady state is narrower than the Poissonian. Finally, it should be noted that all three cases can be obtained from the contour C. In the case where δ = 0, the cut from the singularity at α = k_2 A/k_4 to −∞ vanishes and C may be distorted to a simple contour round the pole, while if δ > 0, the singularity at α = k_2 A/k_4 is now integrable so the contour may be collapsed onto the cut and the integral evaluated as a discontinuity integral over the range [0, k_2 A/k_4]. (When δ is a positive integer, this argument requires modification.)

Fig. 7.3. Contour C in the complex plane for the evaluation of (7.7.38)


b) Example: Reactions B → X (rate constant k_1), 2X → A (rate constant k_2)
for which the Fokker-Planck equation is

    \frac{\partial f(\alpha, t)}{\partial t} = -\frac{\partial}{\partial \alpha} [(\kappa_1 V - 2\kappa_2 V^{-1} \alpha^2) f(\alpha, t)] - \frac{\partial^2}{\partial \alpha^2} [(\kappa_2 V^{-1} \alpha^2) f(\alpha, t)],    (7.7.41)

where κ_1 V = k_1 B, κ_2 V^{-1} = k_2 and V is the system volume. Note that the diffusion coefficient in the above FPE is negative on the whole real line.
The potential solution of (7.7.41) is (up to a normalisation factor)

    f(\alpha) = \alpha^{-2} \exp(2\alpha + aV^2/\alpha)    (7.7.42)

with a = 2κ_1/κ_2 and the α integration is to be performed along a closed contour encircling the origin. Of course, in principle, there is another solution obtained by solving the stationary FPE in full. However, only the potential solution is single valued and allows us to choose an acceptable contour on which partial integration is permitted.

Thus, by putting α = ηV, we get

    \langle x^r \rangle_f = V^r \frac{\oint d\eta\, e^{V(2\eta + a/\eta)} \eta^{r-2}}{\oint d\eta\, e^{V(2\eta + a/\eta)} \eta^{-2}}.    (7.7.43)

The function (2η + a/η) does not have a maximum at the deterministic steady state. In fact, it has a minimum at the deterministic steady state η = +(a/2)^{1/2}.

However, in the complex η plane this point is a saddle point and provides the dominant contribution to the integral.

Thus, the negative diffusion coefficient in (7.7.41) reflects itself by giving rise to a saddle point at the deterministic steady state, which results in the variance in X being less than ⟨x⟩.

From (7.7.43) all the steady state moments can be calculated exactly. The results are

    \langle x^r \rangle_f = \Bigl[ V \Bigl( \frac{a}{2} \Bigr)^{1/2} \Bigr]^r \frac{I_{r-1}(2(2a)^{1/2} V)}{I_1(2(2a)^{1/2} V)},    (7.7.44)

where I_r(2(2a)^{1/2} V) are the modified Bessel functions. Using the large-argument expansion for I_r(2(2a)^{1/2} V), we get

    \langle x \rangle = V(a/2)^{1/2} + \tfrac{1}{8} + O(1/V)
                                                                      (7.7.45)
    var\{x\} = \tfrac{3}{4} V(a/2)^{1/2} - \tfrac{1}{16} + O(1/V).

These asymptotic results can also be obtained by directly applying the method of steepest descents to (7.7.43). In general, this kind of expansion will always be possible after explicitly exhibiting the volume dependence of the parameters.


c) Summary of Advantages 

The complex Poisson representation yields stationary solutions in analytic form to 
which asymptotic or exact methods are easily applicable. It is not so useful in the 
case of time-dependent solutions. The greatest advantages, however, occur in 
quantum mechanical systems where similar techniques can be used for complex P 
representations which can give information that is not otherwise extractable. These 
are treated in Chap. 10. 


7.7.4 The Positive Poisson Representation 
Here we choose α to be a complex variable α_x + iα_y,

    d\mu(\alpha) = d^2\alpha = d\alpha_x\, d\alpha_y,    (7.7.46)

and 𝒟 is the whole complex plane. We show in Sect. 10.6.3 that for any P(x), a positive f(α) exists such that

    P(x) = \int d^2\alpha\, (e^{-\alpha} \alpha^x / x!) f(\alpha);    (7.7.47)

thus, the positive P representation always exists. It is not unique, however. For example, choose

    f_p(\alpha) = (2\pi\sigma^2)^{-1} \exp(-|\alpha - \alpha_0|^2 / 2\sigma^2)    (7.7.48)

and note that if g(α) is any analytic function of α, we can write

    g(\alpha) = g(\alpha_0) + \sum_{n=1}^{\infty} g^{(n)}(\alpha_0) (\alpha - \alpha_0)^n / n!    (7.7.49)

so that

    \int (2\pi\sigma^2)^{-1} d^2\alpha\, \exp(-|\alpha - \alpha_0|^2 / 2\sigma^2)\, g(\alpha) = g(\alpha_0),    (7.7.50)

since the terms with n ≥ 1 vanish when integrated in (7.7.50). Noting that the Poisson form e^{-α}α^x/x! is itself analytic in α, we obtain for any positive value of σ²

    P(x) = \int d^2\alpha\, f_p(\alpha) e^{-\alpha} \alpha^x / x! = e^{-\alpha_0} \alpha_0^x / x!.    (7.7.51)


In practice, this nonuniqueness is an advantage rather than a problem. 
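
The nonuniqueness can be illustrated by a small Monte Carlo experiment. The sketch below is an assumption-laden illustration, not part of the original argument: the function names, the sample size, the seed and the values α_0 = 2, σ = 0.3 are all hypothetical. It draws complex α from the Gaussian (7.7.48) and averages the Poisson kernel, which by (7.7.51) should reproduce the Poisson distribution with mean α_0 whatever the width σ.

```python
import cmath
import math
import random

def poisson_kernel(alpha, x):
    """Poisson form exp(-alpha) * alpha**x / x! evaluated at complex alpha."""
    return cmath.exp(-alpha) * alpha ** x / math.factorial(x)

def smeared_poisson(alpha0, sigma, x, samples=20000, seed=1):
    """Monte Carlo estimate of the integral in (7.7.51).

    alpha is sampled from a two-dimensional Gaussian centred on alpha0 whose real
    and imaginary parts each have standard deviation sigma, as in (7.7.48).
    """
    rng = random.Random(seed)
    total = 0j
    for _ in range(samples):
        alpha = complex(alpha0 + sigma * rng.gauss(0.0, 1.0), sigma * rng.gauss(0.0, 1.0))
        total += poisson_kernel(alpha, x)
    return total / samples

alpha0, sigma = 2.0, 0.3  # assumed illustrative values
for x in range(6):
    estimate = smeared_poisson(alpha0, sigma, x)
    exact = math.exp(-alpha0) * alpha0 ** x / math.factorial(x)
    print(x, round(estimate.real, 4), round(exact, 4))
```

The estimate carries sampling noise that grows with σ, so agreement is only statistical; the imaginary part of the average fluctuates around zero.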


a) Fokker-Planck Equations 

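We make use of the analyticity of the Poisson and its generating function to produce Fokker-Planck equations with positive diffusion matrices. A FPE of the form of (7.7.9) arises from a generating function equation

    \partial_t G(s, t) = \int d^2\alpha\, f(\alpha, t) \Bigl( \sum_a A_a \frac{\partial}{\partial \alpha_a} + \frac{1}{2} \sum_{a,b} B_{ab} \frac{\partial^2}{\partial \alpha_a \partial \alpha_b} \Bigr) \exp\Bigl[ \sum_a (s_a - 1)\alpha_a \Bigr].    (7.7.52)

We now take explicit account of the fact that α is a complex variable

    \alpha = \alpha_x + i\alpha_y    (7.7.53)

and also write

    A(\alpha) = A_x(\alpha) + iA_y(\alpha).    (7.7.54)

We further write

    B(\alpha) = C(\alpha) C^T(\alpha)    (7.7.55)

and

    C(\alpha) = C_x(\alpha) + iC_y(\alpha).    (7.7.56)

For brevity we use

    \partial_a = \partial / \partial \alpha_a
    \partial_a^x = \partial / \partial \alpha_{a;x}    (7.7.57)
    \partial_a^y = \partial / \partial \alpha_{a;y}.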


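Because of the analyticity of exp[Σ_a (s_a − 1)α_a], in the generating function equation (7.7.52) we can always make the interchangeable choice

    \partial_a \leftrightarrow \partial_a^x \leftrightarrow -i\partial_a^y.    (7.7.58)

We then substitute the form (7.7.55) for B_{ab}, and replace ∂_a by either ∂_a^x or −i∂_a^y according to whether the corresponding index on A or C is x or y respectively. We then derive

    \partial_t G(s, t) = \int d^2\alpha\, f(\alpha, t) \Bigl\{ \Bigl[ \sum_a (A_{a;x} \partial_a^x + A_{a;y} \partial_a^y)
                         + \frac{1}{2} \sum_{a,b,c} (C_{a,c;x} C_{b,c;x} \partial_a^x \partial_b^x + C_{a,c;y} C_{b,c;y} \partial_a^y \partial_b^y
                         + 2 C_{a,c;x} C_{b,c;y} \partial_a^x \partial_b^y) \Bigr] \exp\Bigl[ \sum_a (s_a - 1)\alpha_a \Bigr] \Bigr\}.    (7.7.59)

Integrating by parts and discarding the surface terms, we get a FPE in the variables (α_x, α_y),

    \partial_t f(\alpha, t) = \Bigl[ -\sum_a (\partial_a^x A_{a;x} + \partial_a^y A_{a;y}) + \frac{1}{2} \sum_{a,b,c} (\partial_a^x \partial_b^x C_{a,c;x} C_{b,c;x}
                              + \partial_a^y \partial_b^y C_{a,c;y} C_{b,c;y} + 2 \partial_a^x \partial_b^y C_{a,c;x} C_{b,c;y}) \Bigr] f(\alpha, t).    (7.7.60)

In the space of doubled dimensions, this is a FPE with positive semidefinite diffusion. For, we have for the variable (α_x, α_y) the drift vector

    \mathcal{A}(\alpha) = [A_x(\alpha), A_y(\alpha)]    (7.7.61)

and the diffusion matrix

    \mathcal{B}(\alpha) = \begin{pmatrix} C_x C_x^T & C_x C_y^T \\ C_y C_x^T & C_y C_y^T \end{pmatrix} = \mathcal{C}(\alpha) \mathcal{C}(\alpha)^T    (7.7.62)

where

    \mathcal{C}(\alpha) = \begin{pmatrix} C_x & 0 \\ C_y & 0 \end{pmatrix},    (7.7.63)

so that ℬ(α) is explicitly positive semidefinite.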


b) Stochastic Differential Equation (SDE)
Corresponding to the drift and diffusion (7.7.61, 62) we have a stochastic differential equation

    \begin{pmatrix} d\alpha_x \\ d\alpha_y \end{pmatrix} = \begin{pmatrix} A_x(\alpha) \\ A_y(\alpha) \end{pmatrix} dt + \begin{pmatrix} C_x\, dW(t) \\ C_y\, dW(t) \end{pmatrix}    (7.7.64)

where W(t) is a Wiener process of the same dimension as α_x. Note that the same Wiener process occurs in both lines because of the two zero entries in 𝒞(α) as written in (7.7.63).

Recombining real and imaginary parts, we find the SDE for the complex variable α:

    d\alpha = A(\alpha)\, dt + C(\alpha)\, dW(t).    (7.7.65)

This is, of course, exactly the same SDE which would arise if we used the usual rules for converting Fokker-Planck equations to stochastic differential equations directly on the Poisson representation FPE (7.7.9), and ignored the fact that C(α) so defined would have complex elements if B was not a positive semidefinite diffusion matrix.


c) Examples of Stochastic Differential Equations in the Complex Plane
We again consider the reactions (Sect. 7.7c)

    A + X \underset{k_4}{\overset{k_2}{\rightleftharpoons}} 2X
                                                                      (7.7.66)
    B + X \underset{k_3}{\overset{k_1}{\rightleftharpoons}} C.

The use of the positive Poisson representation applied to this system yields the SDE, arising from the FPE (7.7.16):

    d\alpha = [k_3 C + (k_2 A - k_1 B)\alpha - k_4 \alpha^2]\, dt + [2(k_2 A \alpha - k_4 \alpha^2)]^{1/2}\, dW(t).    (7.7.67)

In the case δ > 0, we note that the noise term vanishes at α = 0 and at α = k_2 A/k_4, is positive between these points and the drift term is such as to return α to the range [0, k_2 A/k_4] whenever it approaches the end points. Thus, for δ > 0, (7.7.67) represents a real SDE on the real interval [0, k_2 A/k_4].

In the case δ < 0, the stationary point lies outside the interval [0, k_2 A/k_4], and a point initially in this interval will move along this interval governed by (7.7.67) until it meets the right-hand end, where the noise vanishes and the drift continues to drive it towards the right. On leaving the interval, the noise becomes imaginary and the point will follow a path like that shown in Fig. 7.4 until it eventually reaches the interval [0, k_2 A/k_4] again.

The case of δ = 0 is not very dissimilar, except that once the point reaches the right-hand end of the interval [0, k_2 A/k_4], both drift and diffusion vanish so it remains there from then on.

In the case of the system

    B \xrightarrow{k_1} X
                                (7.7.68)
    2X \xrightarrow{k_2} A,

the SDE coming from the FPE (7.7.41) is

    d\eta/dt = \kappa_1 - 2\kappa_2 \eta^2 + i\varepsilon (2\kappa_2)^{1/2} \eta\, \xi(t),    (7.7.69)

where α = ηV and ε = V^{-1/2}.

Fig. 7.4. Path followed by a point obeying the stochastic differential equation (7.7.67)

Fig. 7.5. Simulation of the path of a point obeying the stochastic differential equation (7.7.69)

The SDE (7.7.69) can be computer simulated and a plot of motion in the complex η plane generated. Figure 7.5 illustrates the behaviour. The point is seen to remain in the vicinity of Re{η} = (a/2)^{1/2} but to fluctuate mainly in the imaginary direction on either side, thus giving rise to a negative variance in α.
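
A simulation of this kind can be sketched in a few lines. The version below is only an illustration under stated assumptions: it uses a plain Euler-Maruyama step, treating (7.7.69) as an Ito equation, and the function name, step size, run length, seed and the values κ_1 = 1, κ_2 = 0.5, V = 100 are hypothetical choices, not those used for Fig. 7.5. The reference point is the zero of the drift, η = (κ_1/2κ_2)^{1/2}.

```python
import random

def simulate_eta(kappa1, kappa2, volume, dt=1e-3, steps=200000, seed=7):
    """Euler-Maruyama integration of (7.7.69) in the complex eta plane.

    Returns the time-averaged eta and the time-averaged <eta^2> - <eta>^2,
    both complex numbers.
    """
    rng = random.Random(seed)
    eps = volume ** -0.5
    # Imaginary noise amplitude per step: i * eps * sqrt(2*kappa2) * sqrt(dt)
    noise_scale = 1j * eps * (2.0 * kappa2) ** 0.5 * dt ** 0.5
    eta = complex((kappa1 / (2.0 * kappa2)) ** 0.5, 0.0)  # start at the zero of the drift
    sum1 = 0j
    sum2 = 0j
    for _ in range(steps):
        drift = kappa1 - 2.0 * kappa2 * eta * eta
        eta += drift * dt + noise_scale * eta * rng.gauss(0.0, 1.0)
        sum1 += eta
        sum2 += eta * eta
    mean = sum1 / steps
    return mean, sum2 / steps - mean * mean

mean_eta, var_eta = simulate_eta(kappa1=1.0, kappa2=0.5, volume=100.0)
print("time-averaged eta:", mean_eta)
print("time-averaged <eta^2> - <eta>^2:", var_eta)
```

With these values the trajectory stays close to η = 1 and wanders mostly along the imaginary direction, so the real part of the second printed quantity comes out negative, in line with the negative variance described above. For small V the noise is no longer weak and a simple fixed-step scheme like this one may become unreliable.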


7.7.5 Time Correlation Functions 


The time correlation function of a Poisson variable α is not the same as that for the variable x. This can be seen, for example, in the case of a reaction X ⇌ Y which gives a Poisson representation Fokker-Planck equation with no diffusion term. Hence, the Poisson variable does not fluctuate. We now show what the relationship is. For clarity, the demonstration is carried out for one variable only.

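We define

    \langle \alpha(t)\alpha(s) \rangle = \int d\mu(\alpha)\, d\mu(\alpha')\, \alpha\alpha' f(\alpha, t|\alpha', s) f(\alpha', s).    (7.7.70)

We note that

    f(\alpha, s|\alpha', s) = \delta_\mu(\alpha - \alpha')

which means that

    \int d\mu(\alpha)\, e^{-\alpha} (\alpha^x / x!) f(\alpha, s|\alpha', s) = e^{-\alpha'} \alpha'^x / x!    (7.7.71)

so that

    \int d\mu(\alpha)\, \alpha f(\alpha, t|\alpha', s) = \sum_{x,x'} x P(x, t|x', s)\, e^{-\alpha'} \alpha'^{x'} / x'!.

Hence,

    \langle \alpha(t)\alpha(s) \rangle = \sum_{x,x'} x P(x, t|x', s) \int d\mu(\alpha')\, (\alpha'^{x'+1} e^{-\alpha'} / x'!) f(\alpha', s)

      = \sum_{x,x'} x P(x, t|x', s) \int d\mu(\alpha') \Bigl[ \Bigl( -\alpha' \frac{\partial}{\partial \alpha'} + x' \Bigr) (\alpha'^{x'} e^{-\alpha'} / x'!) \Bigr] f(\alpha', s)

      = \sum_{x,x'} x x' P(x, t|x', s) P(x', s)    (7.7.72)

        - \int d\mu(\alpha')\, f(\alpha', s)\, \alpha' \frac{\partial}{\partial \alpha'} \sum_{x,x'} x P(x, t|x', s) (\alpha'^{x'} e^{-\alpha'} / x'!).    (7.7.73)

We define

    \langle \alpha(t) | [\alpha', s] \rangle = \int d\alpha\, \alpha f(\alpha, t|\alpha', s)    (7.7.74)

as the mean of α(t) given the initial condition α' at s. Then the second term can be written

    -\int d\mu(\alpha')\, \alpha' \frac{\partial}{\partial \alpha'} \langle \alpha(t) | [\alpha', s] \rangle f(\alpha', s) \equiv -\Bigl\langle \alpha' \frac{\partial}{\partial \alpha'} \langle \alpha(t) | [\alpha', s] \rangle \Bigr\rangle    (7.7.75)

so we have

    \langle x(t) x(s) \rangle = \langle \alpha(t)\alpha(s) \rangle + \Bigl\langle \alpha' \frac{\partial}{\partial \alpha'} \langle \alpha(t) | [\alpha', s] \rangle \Bigr\rangle.    (7.7.76)

Taking into account a many-variable situation and noting that

    \langle x(t) \rangle = \langle \alpha(t) \rangle  always,

we have

    \langle x_a(t), x_b(s) \rangle = \langle \alpha_a(t), \alpha_b(s) \rangle + \Bigl\langle \alpha_b' \frac{\partial}{\partial \alpha_b'} \langle \alpha_a(t) | [\alpha', s] \rangle \Bigr\rangle.    (7.7.77)

This formula explicitly shows the fact that the Poisson representation gives a process which is closely related to the Birth-Death Master equation, but not isomorphic to it. The stochastic quantities of interest, such as time correlation functions, can all be calculated but are not given directly by those of the Poisson variable.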
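
As a quick consistency check of (7.7.77), the following worked example treats a single-variable linear system. The reaction scheme (creation of X at constant rate k_1, decay of X at rate k_2 x) and the symbol τ = t − s are assumptions introduced only for illustration; they do not come from the text.

```latex
% Assumed illustrative scheme: creation at constant rate k_1, decay at rate k_2 x.
% The scheme is unimolecular, so the Poisson-representation equation has no
% diffusion term and alpha obeys the deterministic rate equation:
\frac{d\alpha}{dt} = k_1 - k_2 \alpha
\;\Longrightarrow\;
\langle \alpha(t) \,|\, [\alpha', s] \rangle
  = \alpha' e^{-k_2 \tau} + \frac{k_1}{k_2}\bigl(1 - e^{-k_2 \tau}\bigr),
\qquad \tau = t - s .

% Stationary quasiprobability: f_s(\alpha') = \delta(\alpha' - k_1/k_2), hence
\langle \alpha(t), \alpha(s) \rangle_s = 0 ,
\qquad
\Bigl\langle \alpha' \frac{\partial}{\partial \alpha'}
   \langle \alpha(t) \,|\, [\alpha', s] \rangle \Bigr\rangle_s
  = \frac{k_1}{k_2}\, e^{-k_2 \tau} .

% Inserting both terms into (7.7.77):
\langle x(t), x(s) \rangle_s = \frac{k_1}{k_2}\, e^{-k_2 \tau} .
```

This is the exponentially decaying correlation expected for a stationary Poisson-distributed population with variance k_1/k_2. In this example the whole correlation comes from the second (response) term, since α itself does not fluctuate.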


a) Interpretation in Terms of Statistical Mechanics 

We assume for the moment that the reader is acquainted with the statistical 
mechanics of chemical systems. If we consider a system composed of chemically 
reacting components A, B, C, ..., the distribution function in the grand canonical 
ensemble is given by 


    P(I) = \exp\Bigl\{ \beta \Bigl[ \Omega + \sum_i \mu_i x_i(I) - E(I) \Bigr] \Bigr\},    (7.7.78)

where I is an index describing the microscopic state of the system, x_i(I) is the number of molecules of X_i in the state I, E(I) is the energy of the state, μ_i is the chemical potential of component X_i, Ω is a normalization factor, and

    \beta = 1/kT.    (7.7.79)

The fact that the components can react requires certain relationships between the chemical potentials to be satisfied, since a state I can be transformed into a state J only if

    \sum_i \nu_i^A x_i(I) = \sum_i \nu_i^A x_i(J),    A = 1, 2, 3, ...    (7.7.80)

where ν_i^A are certain integers. The relations (7.7.80) are the stoichiometric constraints.

The canonical ensemble for a reacting system is defined by requiring

    \sum_i \nu_i^A x_i(I) = \tau^A,    (7.7.81)

for some τ^A, whereas the grand canonical ensemble is defined by requiring

    \sum_I P(I) \sum_i \nu_i^A x_i(I) \equiv \sum_i \nu_i^A \langle x_i \rangle = \tau^A.    (7.7.82)

Maximization of entropy subject to the constraint (7.7.82) (and the usual constraints of fixed total probability and mean energy) gives the grand canonical form (7.7.78) in which the chemical potentials also satisfy the relation

    \mu_i = \sum_A k_A \nu_i^A.    (7.7.83)


When one takes the ideal solution or ideal gas limit, in which interaction ener- 
gies (but not kinetic or internal energies) are neglected, there is no difference 
between the distribution function for an ideal reacting system and an ideal nonre- 
acting system, apart from the requirement that the chemical potentials be ex- 
pressible in the form of (7.7.83). 

The canonical ensemble is not so simple, since the constraints must appear explicitly as a factor of the form

    \prod_A \delta\Bigl[ \sum_i \nu_i^A x_i(I), \tau^A \Bigr]    (7.7.84)

and the distribution function is qualitatively different for every kind of reacting system (including a nonreacting system as a special case).


The distribution in total numbers x of molecules of reacting components in the grand canonical ensemble of an ideal reacting system is easily evaluated, namely,

    P(x) = \exp\Bigl[ \beta \Bigl( \Omega + \sum_i \mu_i x_i \Bigr) \Bigr] \sum_I \prod_i \delta[x_i(I), x_i] \exp[-\beta E(I)].    (7.7.85)

The sum over states is the same as that for the canonical ensemble of an ideal nonreacting mixture so that

    P(x) = \exp\Bigl[ \beta \Bigl( \Omega + \sum_i \mu_i x_i \Bigr) \Bigr] \prod_i \frac{1}{x_i!} \Bigl\{ \sum_k \exp[-\beta E_k(i)] \Bigr\}^{x_i}    (7.7.86)

where E_k(i) are the energy eigenstates of a single molecule of the substance X_i. This result is a multivariate Poisson with mean numbers given by

    \log \langle x_i \rangle = \beta\mu_i + \log\Bigl[ \sum_k e^{-\beta E_k(i)} \Bigr]    (7.7.87)

which, as is well known, when combined with the requirement (7.7.82) gives the law of mass action.
The canonical ensemble is obtained by maximizing entropy subject to the stronger constraint (7.7.81), which implies the weak constraint (7.7.82). Thus, the distribution function in total numbers for the canonical ensemble will simply be given by

    P(x) \propto \Bigl[ \prod_i \frac{1}{x_i!} \Bigl\{ \sum_k e^{-\beta E_k(i)} \Bigr\}^{x_i} \Bigr] \prod_A \delta\Bigl[ \sum_i \nu_i^A x_i, \tau^A \Bigr].    (7.7.88)

In terms of the Poisson representation, we have just shown that in equilibrium situations, the quasiprobability (in a grand canonical ensemble) is

    f(\alpha)_{eq} = \delta[\alpha - \alpha(eq)]    (7.7.89)

since the x space distribution is Poissonian. For the time correlation functions there are two results of this.

i) The variables α(t) and α(s) are nonfluctuating quantities with values α(eq). Thus,

    \langle \alpha_a(t), \alpha_b(s) \rangle_{eq} = 0.    (7.7.90)

ii) The equilibrium mean in the second term is trivial. Thus,

    \langle x_a(t), x_b(s) \rangle = \Bigl[ \alpha_b' \frac{\partial}{\partial \alpha_b'} \langle \alpha_a(t) | [\alpha', s] \rangle \Bigr]_{\alpha' = \alpha(eq)}.    (7.7.91)

This result is, in fact, exactly that of Bernard and Callen [7.11] which relates a two-time correlation function to a derivative of the mean of a quantity with respect to a thermodynamically conjugate variable.

Consider a system in which the numbers of molecules of chemical species X_1, X_2, ... corresponding to a configuration I of the system are x_1(I), x_2(I), ... and it is understood that these chemical species may react with each other. Then in a grand canonical ensemble, as demonstrated above, the equilibrium distribution function is

    Z^{-1}(\mu) \exp\Bigl[ \Bigl\{ \sum_i \mu_i x_i(I) - E(I) \Bigr\} / kT \Bigr]    (7.7.92)

with

    Z(\mu) = \exp(-\Omega\beta),    (7.7.93)

where Z(μ) is the grand canonical partition function. As pointed out above, the chemical potentials μ_i for a reacting system cannot be chosen arbitrarily but must be related by the stoichiometric constraints (7.7.82) of the allowable reactions.
Now we further define the quantities

    \langle x_i, t | [I, s] \rangle    (7.7.94)

to be the mean values of the quantities x_i at time t under the condition that the system was in a configuration I at time s. Then a quantity of interest is the mean value of (7.7.94) over the distribution (7.7.92) of initial conditions, namely,

    \langle x_i, t | [\mu, s] \rangle = \sum_I \langle x_i, t | [I, s] \rangle Z^{-1}(\mu) \exp\Bigl[ \frac{1}{kT} \Bigl\{ \sum_j \mu_j x_j(I) - E(I) \Bigr\} \Bigr].    (7.7.95)

When the chemical potentials satisfy the equilibrium constraints, this quantity will be time independent and equal to the mean of x_i in equilibrium, but otherwise it will have a time dependence. Then, with a little manipulation one finds that

    \Bigl[ kT \frac{\partial}{\partial \mu_j} \langle x_i, t | [\mu, s] \rangle \Bigr]_{\mu = \mu(eq)} = \langle x_i(t), x_j(s) \rangle_{eq}.    (7.7.96)

The left-hand side is a response function of the mean value to the change in the chemical potentials around equilibrium and is thus a measure of dissipation, while the right-hand side, the two-time correlation function in equilibrium, is a measure of fluctuations.

To make contact with the Poisson representation result (7.7.91) we note that the chemical potentials μ_i in ideal solution theory are given by

    \mu_i(\langle x \rangle) = kT \log \langle x_i \rangle + const.    (7.7.97)

Using (7.7.97), we find that (7.7.96) becomes

    \langle x_i(t), x_j(s) \rangle = \Bigl[ \langle x_j \rangle \frac{\partial}{\partial \langle x_j \rangle} \langle x_i, t | [\mu(\langle x \rangle), s] \rangle \Bigr]_{\langle x \rangle = \langle x \rangle_{eq}}.    (7.7.98)

Since the ideal solution theory gives rise to a distribution in x_i that is Poissonian, it follows that in that limit

    \langle x_i, t | [\mu(\langle x \rangle), s] \rangle = \langle \alpha_i, t | [\alpha', s] \rangle    (7.7.99)

with α' = ⟨x⟩. Thus, (7.7.98) becomes

    \langle x_i(t), x_j(s) \rangle = \Bigl[ \alpha_j' \frac{\partial}{\partial \alpha_j'} \langle \alpha_i, t | [\alpha', s] \rangle \Bigr]_{\alpha' = \alpha(eq)}.    (7.7.100)

Thus, (7.7.91) is the ideal solution limit of the general result (7.7.98).

The general formula (7.7.77) can be considered as a generalization of the Bernard-Callen result to systems that are not in thermodynamic equilibrium.

However, it is considerably different from the equilibrium result and the two terms are directly interpretable. The second term is the equilibrium contribution, a response function, but since the system is not in a well-defined equilibrium state, we take the average of the equilibrium result over the various contributing α-space states. The first term is the contribution from the α-space fluctuations themselves and is not directly related to a response function. It represents the fluctuations in excess of equilibrium.
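b) Linearised Results
The general differential equation, arising from the use of the positive Poisson representation, and corresponding to the FPE (7.7.12), is

$$d\eta = A(\eta)\,dt + \varepsilon\, C(\eta)\, dW(t), \tag{7.7.101}$$

where

$$C C^{\mathrm T} = B. \tag{7.7.102}$$

We may now make a first-order small noise expansion about the stationary state $\bar\eta$ by following the procedure of Sect. 6.3. Thus, writing

$$\eta(t) = \bar\eta + \varepsilon\,\eta_1(t) \qquad (\varepsilon = V^{-1/2}) \tag{7.7.103}$$

to lowest order we have

$$A(\bar\eta) = 0, \qquad d\eta_1 = -F\eta_1\,dt + G\,dW(t), \tag{7.7.104}$$

where

$$F_{rs} = -\frac{\partial}{\partial \eta_s}A_r(\bar\eta), \qquad G = C(\bar\eta). \tag{7.7.105}$$

Then,

$$\langle \alpha_r(t), \alpha_s(0)\rangle_s = V \sum_{r'} [\exp(-Ft)]_{r,r'}\, \langle \eta_{r',1}, \eta_{s,1}\rangle_s \tag{7.7.106}$$

and

$$\frac{\partial}{\partial \alpha_s'}\langle \alpha_r(t) \,|\, [\alpha', 0]\rangle = \frac{\partial}{\partial \eta_s'}\langle \eta_r(t) \,|\, [\eta', 0]\rangle = [\exp(-Ft)]_{r,s}. \tag{7.7.107}$$

Hence,

$$\langle x_r(t), x_s(0)\rangle_s = V \sum_{r'} [\exp(-Ft)]_{r,r'}\,\big(\langle \eta_{r',1}, \eta_{s,1}\rangle_s + \delta_{r',s}\,\bar\eta_s\big) \tag{7.7.108}$$

$$= \sum_{r'} [\exp(-Ft)]_{r,r'}\, \langle x_{r'}, x_s\rangle_s. \tag{7.7.109}$$

Thus the linearised result is in agreement with the regression theorem of Sect. 3.7.4. Correlation functions for a variety of systems have been computed in [7.10].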


7.7.6 Trimolecular Reaction 
In Sect. 7.1.3 we considered a reaction which included a part 


$$A + 2X \rightleftharpoons 3X \tag{7.7.110}$$




and set up an appropriate birth-death Master equation for this. However, it is well known in chemistry that such trimolecular steps are of vanishingly small probability and proceed in stages via a short-lived intermediate. Thus, the reaction (7.7.110) presumably occurs as a two-stage system

$$\text{i)}\quad A + Y \underset{1}{\overset{1}{\rightleftharpoons}} X + Y \tag{7.7.111a}$$

$$\text{ii)}\quad Y \underset{1}{\overset{\gamma}{\rightleftharpoons}} 2X, \tag{7.7.111b}$$

both of which are merely bimolecular, and we have set rate constants equal to one, except for $\gamma$ (the decay constant of $Y$), which is assumed to be very large. Thus, $Y$ is indeed a short-lived intermediate. The deterministic rate equations are

$$\frac{dx}{dt} = ay - xy + 2(\gamma y - x^2), \qquad \frac{dy}{dt} = x^2 - \gamma y, \tag{7.7.112}$$

and the usual deterministic adiabatic elimination procedure sets $y = x^2/\gamma$ and gives

$$\frac{dx}{dt} = (ax^2 - x^3)/\gamma. \tag{7.7.113}$$
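The deterministic elimination can be checked numerically. The following is a minimal sketch, not part of the original treatment: it integrates the two-variable rate equations (7.7.112) and the reduced equation (7.7.113) side by side with a hand-written Runge-Kutta step. The parameter values, the function names and the choice of starting $y$ on its slaved value $x^2/\gamma$ are all illustrative assumptions.

```python
def rk4_step(f, state, dt):
    """Advance `state` (a tuple of floats) by one classical Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = f(state)
    k2 = f(shifted(state, k1, dt / 2))
    k3 = f(shifted(state, k2, dt / 2))
    k4 = f(shifted(state, k3, dt))
    return tuple(
        s + dt * (p + 2 * q + 2 * r + w) / 6
        for s, p, q, r, w in zip(state, k1, k2, k3, k4)
    )


def compare_full_and_reduced(a=2.0, gamma=100.0, x0=1.0, t_end=400.0, dt=0.01):
    """Integrate (7.7.112) and (7.7.113) together.

    Returns the final x of the full model, the final x of the reduced model,
    and the largest difference between the two seen along the way.
    """
    def full(s):
        x, y = s
        return (a * y - x * y + 2 * (gamma * y - x * x), x * x - gamma * y)

    def reduced(s):
        (x,) = s
        return ((a * x * x - x ** 3) / gamma,)

    s_full = (x0, x0 * x0 / gamma)  # assumption: y starts on the slaved value
    s_red = (x0,)
    worst = 0.0
    for _ in range(int(round(t_end / dt))):
        s_full = rk4_step(full, s_full, dt)
        s_red = rk4_step(reduced, s_red, dt)
        worst = max(worst, abs(s_full[0] - s_red[0]))
    return s_full[0], s_red[0], worst


x_full, x_red, worst = compare_full_and_reduced()
print(x_full, x_red, worst)
```

Both trajectories relax to the common fixed point $x = a$, and one expects the gap between them to shrink as $\gamma$ is increased. The time step must stay well below $1/\gamma$, since the full system is stiff.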


Although this procedure is straightforward deterministically, it is not clear that the stochastic Master equation of the kind used in Sect. 7.1.3 is a valid adiabatic elimination limit. The adiabatic elimination techniques used in Chap. 6 are not easily adapted to direct use on a Master equation but can be straightforwardly adapted to the case of the Poisson representation Fokker-Planck equation.


a) Fokker-Planck Equation for Trimolecular Reaction 

For the reaction (7.7.110) with forward and backward rate constants equal to $1/\gamma$ to correspond to (7.7.113), the Poisson representation Fokker-Planck equation becomes, from (7.7.4),

$$\frac{\partial f}{\partial t} = \frac{1}{\gamma}\left(-\frac{\partial}{\partial \alpha} + 2\frac{\partial^2}{\partial \alpha^2} - \frac{\partial^3}{\partial \alpha^3}\right)\big[\alpha^2(a - \alpha) f\big] \tag{7.7.114}$$

and contains third-order derivatives. There is no truly probabilistic interpretation in terms of any real stochastic process in $\alpha$ space, no matter what kind of Poisson representation is chosen. The concept of third-order noise will be explained in the next section, which will show how probabilistic methods and stochastic differential equations can still be used.
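The origin of the third-order derivative can be made explicit with a few lines of polynomial algebra. The sketch below assumes the usual Poisson representation rule, namely that a step consuming $N$ and producing $M$ molecules of $X$ contributes the operator $(1 - \partial_\alpha)^M - (1 - \partial_\alpha)^N$ acting on the rate term; this rule is taken as an assumption here since (7.7.4) is not reproduced in this section, and the helper names are illustrative.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_pow(p, n):
    """Raise polynomial p to the non-negative integer power n."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, p)
    return out


def derivative_operator(n_in, m_out):
    """Coefficients of (1 - d)^m_out - (1 - d)^n_in in powers of d = d/d(alpha)."""
    one_minus_d = [1, -1]
    high = poly_pow(one_minus_d, m_out)
    low = poly_pow(one_minus_d, n_in)
    size = max(len(high), len(low))
    high = high + [0] * (size - len(high))
    low = low + [0] * (size - len(low))
    return [h - l for h, l in zip(high, low)]


# A + 2X <-> 3X: two X molecules in, three out.
print(derivative_operator(2, 3))  # [0, -1, 2, -1]
```

The coefficients $-1, 2, -1$ of the first, second and third derivatives are those appearing in (7.7.114). For a bimolecular step such as $X \rightleftharpoons 2X$ the same function returns no third-order term, which is why only trimolecular steps lead beyond second order.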


b) Adiabatic Elimination 


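Using the rules developed in (7.4.9), the Fokker-Planck equation for the system (7.7.111) with the correspondence

$$\begin{pmatrix} x \\ y \end{pmatrix} \leftrightarrow \begin{pmatrix} \alpha \\ \beta \end{pmatrix}$$

is

$$\frac{\partial f}{\partial t} = \left\{-\frac{\partial}{\partial \alpha}\big[(a - \alpha)\beta + 2(\gamma\beta - \alpha^2)\big] + \frac{\partial}{\partial \beta}(\gamma\beta - \alpha^2) + \frac{\partial^2}{\partial \alpha^2}(\gamma\beta - \alpha^2) + \frac{\partial^2}{\partial \alpha\,\partial \beta}\big[(a - \alpha)\beta\big]\right\} f. \tag{7.7.115}$$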
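Adiabatic elimination now proceeds as in Sect. 6.6.1. We define new variables

$$x = \alpha, \qquad y = \gamma\beta - \alpha^2 \tag{7.7.116}$$

and consequently, changing variables with

$$\frac{\partial}{\partial \alpha} = \frac{\partial}{\partial x} - 2x\frac{\partial}{\partial y}, \qquad \frac{\partial}{\partial \beta} = \gamma\frac{\partial}{\partial y}, \tag{7.7.117}$$

the FPE becomes

$$\frac{\partial f}{\partial t} = \left\{-\left(\frac{\partial}{\partial x} - 2x\frac{\partial}{\partial y}\right)\left[\frac{(a - x)(y + x^2)}{\gamma} + 2y\right] + \gamma\frac{\partial}{\partial y}y + \left(\frac{\partial}{\partial x} - 2x\frac{\partial}{\partial y}\right)^2 y + \left(\frac{\partial}{\partial x} - 2x\frac{\partial}{\partial y}\right)\frac{\partial}{\partial y}\big[(y + x^2)(a - x)\big]\right\} f. \tag{7.7.118}$$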


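Since $y$ is to be eliminated, there should be a well-defined limit of the $L_1$ operator which governs its motion at fixed $x$. However, this operator is

$$\gamma\frac{\partial}{\partial y}y + \frac{\partial^2}{\partial y^2}\big[4x^2 y - 2x(y + x^2)(a - x)\big] \tag{7.7.119}$$

and the large $\gamma$ limit turns this into deterministic motion. Setting

$$y = v\gamma^{-1/2} \tag{7.7.120}$$

transforms (7.7.119) to

$$\gamma L_1(\gamma) = \gamma\left\{\frac{\partial}{\partial v}v + \frac{\partial^2}{\partial v^2}\Big[2x^3(x - a) + \big(4x^2 - 2x(a - x)\big)v\gamma^{-1/2}\Big]\right\} \;\to\; \gamma L_1 = \gamma\left\{\frac{\partial}{\partial v}v + \frac{\partial^2}{\partial v^2}\big[2x^3(x - a)\big]\right\}. \tag{7.7.121}$$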




With this substitution, we finally identify

$$\gamma^{-1}L_3 = -\gamma^{-1}\frac{\partial}{\partial x}\big[x^2(a - x)\big] \tag{7.7.122}$$
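$$L_2(\gamma) = -\frac{\partial}{\partial x}\big[(a - x)v\gamma^{-3/2} + 2v\gamma^{-1/2}\big] + \gamma^{-1/2}\frac{\partial^2}{\partial x^2}v - \left(\frac{\partial}{\partial x}2x + 2x\frac{\partial}{\partial x}\right)\frac{\partial}{\partial v}v + \frac{\partial^2}{\partial x\,\partial v}\big[(a - x)(v + x^2\gamma^{1/2})\big] + 2x\frac{\partial}{\partial v}\big[(a - x)(v\gamma^{-1} + x^2\gamma^{-1/2}) + 2v\big] \tag{7.7.123}$$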
and

$$\frac{\partial f}{\partial t} = \big[\gamma L_1 + L_2(\gamma) + \gamma^{-1}L_3\big] f. \tag{7.7.124}$$


The projection operator $P$ will be onto the null space of $L_1$ and because $L_1$ depends on $x$, we have

$$L_3 P \neq P L_3. \tag{7.7.125}$$

This means that the equation of motion for $Pf = g$ is found by similar algebra to that used in Sect. 6.5.4. We find

$$s\tilde g(s) = \gamma^{-1}P L_3\,\tilde g(s) + P\big[L_2(\gamma) + \gamma^{-1}L_3\big]\big[s - \gamma L_1 - (1 - P)L_2(\gamma) - \gamma^{-1}(1 - P)L_3\big]^{-1}\big[L_2(\gamma) + \gamma^{-1}(1 - P)L_3\big]\tilde g(s) + g(0). \tag{7.7.126}$$


Notice, however, since for any function of $v$

$$P\phi(v) = p_x(v)\int dv\,\phi(v), \tag{7.7.127}$$

where $p_x(v)$ satisfies

$$L_1 p_x(v) = 0, \tag{7.7.128}$$


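that in $PL_2(\gamma)$, all terms with $\partial/\partial v$ in them vanish. Thus, to highest order in $\gamma$,

$$P L_2(\gamma) \simeq P\gamma^{-1/2}\left(-2v\frac{\partial}{\partial x} + v\frac{\partial^2}{\partial x^2}\right). \tag{7.7.129}$$

The term $[\;]^{-1}$ in (7.7.126) is asymptotic to $-\gamma^{-1}L_1^{-1}$ and the only term in the remaining bracket which can make the whole expression of order $\gamma^{-1}$, like the $L_3$ term, is the term of order $\gamma^{1/2}$ in $L_2(\gamma)$, i.e.,

$$\gamma^{1/2}\frac{\partial}{\partial x}\big[x^2(a - x)\big]\frac{\partial}{\partial v}. \tag{7.7.130}$$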


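Thus, the large $\gamma$ limit of (7.7.126) is

$$s\tilde g(s) = \gamma^{-1}\left\{P L_3\,\tilde g - P\left[-2v\frac{\partial}{\partial x} + v\frac{\partial^2}{\partial x^2}\right]L_1^{-1}\frac{\partial}{\partial x}\left[(a - x)x^2\frac{\partial}{\partial v}\right]p_x(v)\,\tilde p\right\} + g(0), \tag{7.7.131}$$

where we have written

$$g = p_x(v)\,p, \qquad \tilde g = p_x(v)\,\tilde p. \tag{7.7.132}$$

We are now led to the central problem of the evaluation of

$$\int dv'\,v' L_1^{-1}\frac{\partial}{\partial x}(a - x)x^2\frac{\partial}{\partial v'}p_x(v'), \tag{7.7.133}$$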


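which arises in the evaluation of the second part in the braces in (7.7.131). We wish to bring the $\partial/\partial x$ to the left outside the integral, but since $\partial/\partial x$ and $L_1$ do not commute, this requires care. Now

$$\left[\frac{\partial}{\partial x}, L_1^{-1}\right] = -L_1^{-1}\left[\frac{\partial}{\partial x}, L_1\right]L_1^{-1} \tag{7.7.134}$$

and from (7.7.121),

$$\left[\frac{\partial}{\partial x}, L_1\right] = \frac{\partial^2}{\partial v^2}\big[8x^3 - 6ax^2\big]. \tag{7.7.135}$$

Thus,

$$(7.7.133) = \frac{\partial}{\partial x}\int dv'\,v' L_1^{-1}(a - x)x^2\frac{\partial}{\partial v'}p_x(v') + \int dv'\,v' L_1^{-1}\frac{\partial^2}{\partial v'^2}L_1^{-1}\big[(8x^3 - 6ax^2)(a - x)x^2\big]\frac{\partial}{\partial v'}p_x(v'). \tag{7.7.136}$$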


The second term vanishes, though the demonstration that this is so is rather specialised. For, we know that $L_1$ describes an Ornstein-Uhlenbeck process in $v$ and that $p_x(v)$ is its stationary solution. The eigenfunction properties used in Sect. 6.4.2 show that

$$L_1^{-1}\frac{\partial^2}{\partial v^2}L_1^{-1}\frac{\partial}{\partial v}p_x(v) \tag{7.7.137}$$

is proportional to the third eigenfunction, which is orthogonal to $v$, the first eigenfunction of the corresponding backward equation. The first term is now easily computed using the fact that $L_1$ involves the Ornstein-Uhlenbeck process. Using the same techniques as in Sect. 6.6.1, we find that all the $x$ dependence arising from $p_x(v')$ vanishes, and hence

$$(7.7.133) = \frac{\partial}{\partial x}(a - x)x^2. \tag{7.7.138}$$


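We similarly find

$$P L_3\,\tilde g = -p_x(v)\int dv'\,\frac{\partial}{\partial x}\big[x^2(a - x)\big]p_x(v')\,\tilde p = -p_x(v)\frac{\partial}{\partial x}\big[x^2(a - x)\big]\tilde p, \tag{7.7.139}$$

so in the end

$$\frac{\partial p}{\partial t} = \frac{1}{\gamma}\left(-\frac{\partial}{\partial x} + 2\frac{\partial^2}{\partial x^2} - \frac{\partial^3}{\partial x^3}\right)\big[x^2(a - x)\,p\big], \tag{7.7.140}$$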


which is exactly the same as the trimolecular model Fokker-Planck equation 
(7.7.114). This means the trimolecular Master equation is valid in the same limit. 


c) Comments 


i) Notice that this system gives an end result which is not in a Stratonovich form
but in the Ito form, with all derivatives to the left.

ii) The derivation of (7.7.140) means that techniques for understanding such non- 
probabilistic Fokker-Planck equations are required. We outline a possible way 
of doing this in the next section. 


7.7.7 Third-Order Noise 


To handle the third-order Fokker-Planck equations which arise with trimolecular 
reactions, we introduce the stochastic variable V(t) whose conditional probability 
density p(v, t) obeys the third-order partial differential equation 


$$\frac{\partial p(v, t)}{\partial t} = -\frac{1}{6}\frac{\partial^3 p(v, t)}{\partial v^3}. \tag{7.7.141}$$


Since we have already shown in Sect. 3.4 that no Markov process can possibly give
a third-order term like this, some fundamental requirement must be violated by
p(v, t). It turns out that p(v, t) is not always positive, which is permissible in a quasi- 
probability. We will see that in spite of this, the formal probabilistic analogy is 
very useful. 

We know that the solution of (7.7.141), subject to the boundary condition 


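$$p(v, t_0) = \delta(v - v_0), \tag{7.7.142}$$

is given by Fourier transform methods as

$$p(v, t \,|\, v_0, t_0) = \frac{1}{2\pi}\int dq\,\exp\Big\{i\big[q(v - v_0) + \tfrac{1}{6}q^3(t - t_0)\big]\Big\}. \tag{7.7.143}$$

The moments of $V$ can be calculated, after a partial integration, to be

$$\langle [V(t) - v_0]^n\rangle = 0, \quad n \text{ not a multiple of } 3,$$
$$\langle [V(t) - v_0]^{3m}\rangle = \big[(t - t_0)/6\big]^m\,(3m)!/m!\,. \tag{7.7.144}$$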
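The closed form (7.7.144) can be cross-checked against the moment equations implied by (7.7.141). Multiplying (7.7.141) by $v^n$ and integrating by parts three times (assuming boundary terms vanish) gives $d\langle V^n\rangle/dt = \tfrac{1}{6}n(n-1)(n-2)\langle V^{n-3}\rangle$, taking $v_0 = 0$ and $t_0 = 0$ for simplicity. The sketch below solves this recursion with exact rational arithmetic; the function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def moment_coefficient(m):
    """Coefficient c_m in <V^(3m)> = c_m * t^m, obtained from the moment recursion."""
    c = Fraction(1)
    for k in range(1, m + 1):
        n = 3 * k
        # d<V^n>/dt = n(n-1)(n-2)/6 * <V^(n-3)> with <V^(n-3)> = c_(k-1) t^(k-1);
        # integrating t^(k-1) in time supplies the division by k.
        c = c * Fraction(n * (n - 1) * (n - 2), 6) / k
    return c


def closed_form(m):
    """Coefficient of t^m in (7.7.144): (3m)! / (m! * 6^m)."""
    return Fraction(factorial(3 * m), factorial(m) * 6 ** m)


for m in range(5):
    print(m, moment_coefficient(m), closed_form(m))
```

The two columns agree for every $m$ tried, and the $m = 1$ entry reproduces $\langle [V(t) - v_0]^3\rangle = t - t_0$, which is the property used later for $\langle dV(t)^3\rangle = dt$.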
Further, we assume the process (7.7.141) is some kind of generalized Markov process, for which the joint probability distribution is given by

$$p(v_2 t_2; v_1 t_1) = p(v_2 t_2 \,|\, v_1 t_1)\,p(v_1, t_1) \tag{7.7.145}$$

and from (7.7.142) we see that the first factor is a function of only $v_2 - v_1$ and $t_2 - t_1$, so that the variable $V(t_2) - V(t_1)$ is statistically independent of $V(t_1)$ and that this process is a process with independent increments. Thus, $dV(t)$ will be independent of $V(t)$.

The rigorous definition of stochastic integration with respect to V(t) is a task 
that we shall not attempt at this stage. However, it is clear that it will not be too 
dissimilar to Ito integration and, in fact, Hochberg [7.12] has rigorously defined 
higher-order noises of even degree and carried out stochastic integration with 
respect to them. We can show, however, that a stochastic differential equation of 
the form 


$$dy(t) = a(y)\,dt + b(y)\,dW(t) + c(y)\,dV(t) \tag{7.7.146}$$


[with W(t) and V(t) independent processes] is equivalent to a third-order Fokker- 
Planck equation. It is clear that because W(t) and V(t) are processes with inde- 
pendent increments, y(t) is a Markov process. We then calculate 


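$$\lim_{t \to t_0}\frac{\langle [y(t) - y(t_0)]^n\rangle}{t - t_0} = \lim_{dt_0 \to 0}\frac{\langle [dy(t_0)]^n\rangle}{dt_0}, \tag{7.7.147}$$

where $y(t_0)$ is a numerical initial value, not a stochastic variable. From (7.7.146), $y(t)$ depends on $W(t')$ and $V(t')$ for only $t' < t$ and, since $dW(t)$ and $dV(t)$ are independent of $y(t)$, we find

$$\langle dy(t_0)\rangle = \langle a[y(t_0)]\rangle\,dt_0 + \langle b[y(t_0)]\rangle\langle dW(t_0)\rangle + \langle c[y(t_0)]\rangle\langle dV(t_0)\rangle = \langle a[y(t_0)]\rangle\,dt_0 = a[y(t_0)]\,dt_0 \tag{7.7.148}$$

because $y(t_0)$ is a numerical initial value. Similarly, to lowest order in $dt$

$$\langle dy(t_0)^2\rangle = b[y(t_0)]^2\langle dW(t_0)^2\rangle = b[y(t_0)]^2\,dt_0 \tag{7.7.149}$$

$$\langle dy(t_0)^3\rangle = c[y(t_0)]^3\langle dV(t_0)^3\rangle = c[y(t_0)]^3\,dt_0. \tag{7.7.150}$$

Thus, we find

$$\lim_{t \to t_0}\big[\langle y(t) - y(t_0)\rangle/(t - t_0)\big] = a[y(t_0)]$$
$$\lim_{t \to t_0}\big[\langle [y(t) - y(t_0)]^2\rangle/(t - t_0)\big] = b[y(t_0)]^2 \tag{7.7.151}$$
$$\lim_{t \to t_0}\big[\langle [y(t) - y(t_0)]^3\rangle/(t - t_0)\big] = c[y(t_0)]^3$$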
and all higher powers give a zero result. By utilising a similar analysis to that of Sect. 3.4, this is sufficient to show that $y(t)$ is a generalized diffusion process whose generalized FPE is

$$\frac{\partial p(y, t)}{\partial t} = -\frac{\partial}{\partial y}\big[a(y)p\big] + \frac{1}{2}\frac{\partial^2}{\partial y^2}\big[b(y)^2 p\big] - \frac{1}{6}\frac{\partial^3}{\partial y^3}\big[c(y)^3 p\big]. \tag{7.7.152}$$

We define a noise source $\zeta(t)$ by

$$dV(t) = \zeta(t)\,dt, \tag{7.7.153}$$

where

$$\langle \zeta(t)\rangle = \langle \zeta(t)\zeta(t')\rangle = 0 \tag{7.7.154}$$

$$\langle \zeta(t)\zeta(t')\zeta(t'')\rangle = \delta(t - t')\,\delta(t' - t'') \tag{7.7.155}$$

and higher moments can be readily calculated from the moments of $dV(t)$. The independence of increments means that, as with the Ito integral, integrals that have a delta-function singularity at their upper limit are to be taken as zero.


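Example of the Use of Third-Order Noise. Consider the chemical process

$$A + 2X \underset{k_2}{\overset{k_1}{\rightleftharpoons}} 3X, \qquad A \underset{k_4}{\overset{k_3}{\rightleftharpoons}} X \tag{7.7.156}$$

whose Poisson representation FPE is

$$\frac{\partial f(\alpha, t)}{\partial t} = -\frac{\partial}{\partial \alpha}\big[(\kappa_1 V^{-1}\alpha^2 - \kappa_2 V^{-2}\alpha^3 + \kappa_3 V - \kappa_4\alpha) f(\alpha, t)\big] + \frac{1}{2}\frac{\partial^2}{\partial \alpha^2}\big[4(\kappa_1 V^{-1}\alpha^2 - \kappa_2 V^{-2}\alpha^3) f(\alpha, t)\big] - \frac{1}{3!}\frac{\partial^3}{\partial \alpha^3}\big[6(\kappa_1 V^{-1}\alpha^2 - \kappa_2 V^{-2}\alpha^3) f(\alpha, t)\big], \tag{7.7.157}$$

where $\kappa_1 V^{-1} = k_1 A$, $\kappa_2 V^{-2} = k_2$, $\kappa_3 V = k_3$, $\kappa_4 = k_4$.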

In the steady state, (7.7.157) reduces to a linear second-order differential equa- 
tion which may be solved in terms of hypergeometric functions, and an asymptotic 
expansion for the various moments can be obtained using steepest descent methods. 
This procedure, although possible in principle, is not very practicable. It is in such 
cases that the method of stochastic differential equations proves to be very useful 
because of its ease of application. 
The stochastic differential equation equivalent to (7.7.157) is

$$\frac{d\eta(t)}{dt} = \kappa_1\eta(t)^2 - \kappa_2\eta(t)^3 + \kappa_3 - \kappa_4\eta(t) + \mu^3\big\{4[\kappa_1\eta(t)^2 - \kappa_2\eta(t)^3]\big\}^{1/2}\xi(t) + \mu^4\big\{6[\kappa_1\eta(t)^2 - \kappa_2\eta(t)^3]\big\}^{1/3}\zeta(t), \tag{7.7.158}$$

where $\alpha = \eta V$, $\mu = V^{-1/6}$, $\xi(t)$ is the usual Gaussian white noise, and the noise source $\zeta(t)$, henceforth referred to as the "third-order noise", has been defined in (7.7.153-155).

Equation (7.7.158) may be solved iteratively by expanding $\eta(t)$:

$$\eta(t) = \eta_0(t) + \mu^3\eta_3(t) + \mu^4\eta_4(t) + \mu^6\eta_6(t) + \mu^8\eta_8(t) + \mu^9\eta_9(t) + \ldots \tag{7.7.159}$$

which, when substituted in (7.7.158), yields the deterministic equation in the lowest order and linear stochastic differential equations in the higher orders which may be solved as in Sect. 6.2.

In the stationary state the results are 


2 
Kx) = Ving + (No) + 00 = Vino + “ +... (7.7.1602) 


Kx) — Cx?) = Vend> + [2<nons> + 2<nena> + (nd — (nd? + Cn] + -.- 
ee: 28 a*b? , 8ab’no 36x20” 
adedaal 


Eu | 4+... (7.7.160b) 


Cc 3 C C3 3 
(x — «x))> = Viknd> — 3<nd><ned + 3X3) + (nod + --- 
= |= — ue + no| +o, (7.7.160c) 


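where $a = \kappa_1\eta_0^2 - \kappa_2\eta_0^3$, $b = 2\kappa_1 - 3\kappa_2\eta_0$, $c = \kappa_4 - 2\kappa_1\eta_0 + 3\kappa_2\eta_0^2$ and $\eta_0$ is the solution of the steady-state deterministic equation

$$\kappa_1\eta_0^2 - \kappa_2\eta_0^3 + \kappa_3 - \kappa_4\eta_0 = 0. \tag{7.7.161}$$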


Here a few remarks are in order. The third-order noise $\zeta(t)$ contributes to $O(V^{-1})$ to the mean and to $O(1)$ to the variance, but contributes to $O(V)$ to the skewness coefficient. If one is only interested in calculating the mean and the variance to $O(V)$, the third-order noise may be dropped from (7.7.158) and the expansion carried out in powers of $\varepsilon = V^{-1/2}$. Also note that as $c \to 0$, the variance and the higher order corrections become divergent. This, of course, is due to the fact that in this limit, the reaction system exhibits a first-order phase transition type behaviour.
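As a small numerical companion, the sketch below solves the steady-state equation (7.7.161) by bisection and evaluates the constants $a$, $b$ and $c$ as defined above. Note that $c$ is minus the slope of the deterministic rate at $\eta_0$, so $c \to 0$ marks the loss of linear stability. The rate constants used in the example call are arbitrary illustrative values, not taken from the text.

```python
def drift(eta, k1, k2, k3, k4):
    """Deterministic rate k1*eta^2 - k2*eta^3 + k3 - k4*eta from (7.7.161)."""
    return k1 * eta ** 2 - k2 * eta ** 3 + k3 - k4 * eta


def steady_state(k1, k2, k3, k4, lo, hi, tol=1e-12):
    """Bisection for a root of (7.7.161) in [lo, hi]; the rate must change sign there."""
    f_lo = drift(lo, k1, k2, k3, k4)
    f_hi = drift(hi, k1, k2, k3, k4)
    if f_lo * f_hi > 0:
        raise ValueError("rate does not change sign on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = drift(mid, k1, k2, k3, k4)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def expansion_constants(eta0, k1, k2, k4):
    """The constants a, b, c entering (7.7.160), evaluated at eta0."""
    a = k1 * eta0 ** 2 - k2 * eta0 ** 3
    b = 2 * k1 - 3 * k2 * eta0
    c = k4 - 2 * k1 * eta0 + 3 * k2 * eta0 ** 2
    return a, b, c


# Illustrative rate constants (an assumption for this example only).
eta0 = steady_state(2.0, 1.0, 1.0, 2.0, lo=0.0, hi=3.0)
print(eta0, expansion_constants(eta0, 2.0, 1.0, 2.0))
```

When (7.7.161) has three real roots, the bracket `[lo, hi]` selects which steady state is found, and the sign of $c$ then distinguishes the stable branches from the unstable one.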


8. Spatially Distributed Systems 


Reaction diffusion systems are treated in this chapter as a prototype of the host 
of spatially distributed systems that occur in nature. We introduce the subject 
heuristically by means of spatially dependent Langevin equations, whose inade- 
quacies are explained. The more satisfactory multivariate master equation descrip- 
tion is then introduced, and the spatially dependent Langevin equations formulated 
as an approximation to this description, based upon a system size expansion. It is 
also shown how Poisson representation methods can give very similar spatially 
dependent Langevin equations without requiring any approximation. 

We next investigate the consequences of such equations in the spatial and 
temporal correlation structures which can arise, especially near instability points. 
The connection between local and global descriptions is then shown. The chapter 
concludes with a treatment of systems described by a distribution in phase space 
(i.e. the space of velocity and position). This is done by means of the Boltzmann
Master equation.


8.1 Background 


The concept of space is central to our perception of the world, primarily because 
well-separated objects do not, in general, have a great deal of influence on each 
other. This leads to the description of the world, on a macroscopic deterministic 
level, by local quantities such as local density, concentration, temperature, electro-
magnetic potentials, and so on. Deterministically, these are normally thought of as 
obeying partial differential equations such as the Navier-Stokes equations of 
hydrodynamics, the reaction diffusion equations of chemistry or Maxwell’s equa- 
tions of classical electromagnetism. 

The simplest cases to consider are reaction diffusion equations, which describe 
chemical reactions and which form the main topic of this chapter. In order to get 
some feel of the concept, let us first consider a Langevin equation description for 
the time evolution of the concentration $\rho$ of a chemical substance. Then the classical reaction-diffusion equation can be derived as follows. A diffusion current $\mathbf{j}(\mathbf{r}, t)$ exists such that

$$\mathbf{j}(\mathbf{r}, t) = -D\nabla\rho(\mathbf{r}, t) \tag{8.1.1}$$

and (8.1.1) is called Fick's law. If there is no chemical reaction, this current obeys a conservation equation. For, considering an arbitrary volume $V$, the total amount of chemical in this volume can only change because of transport across the boundary $S$ of $V$. Thus, if $N$ is the total amount in $V$,

$$\frac{dN}{dt} = \frac{d}{dt}\int_V d^3r\,\rho(\mathbf{r}, t) = -\int_S d\mathbf{S}\cdot\mathbf{j}(\mathbf{r}, t) = -\int_V d^3r\,\nabla\cdot\mathbf{j}(\mathbf{r}, t). \tag{8.1.2}$$

Hence, since $V$ is arbitrary,

$$\partial_t\rho(\mathbf{r}, t) + \nabla\cdot\mathbf{j}(\mathbf{r}, t) = 0. \tag{8.1.3}$$

Substituting Fick's law (8.1.1) into the conservation equation (8.1.3) we get

$$\partial_t\rho(\mathbf{r}, t) = D\nabla^2\rho(\mathbf{r}, t), \tag{8.1.4}$$


the diffusion equation. Now how can one add fluctuations? First notice that the 
conservation equation (8.1.3) is exact; this follows from its derivation. We cannot 
add a fluctuating term to it. However, Fick’s law could well be modified by adding 
a stochastic source. Thus, we rewrite 


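$$\mathbf{j}(\mathbf{r}, t) = -D\nabla\rho(\mathbf{r}, t) + \mathbf{f}_d(\mathbf{r}, t). \tag{8.1.5}$$

Here, $\mathbf{f}_d(\mathbf{r}, t)$ is a vector Langevin source. The simplest assumption to make concerning its stochastic properties is

$$\langle \mathbf{f}_d(\mathbf{r}, t)\rangle = 0$$

and

$$\langle f_{d,i}(\mathbf{r}, t)\,f_{d,j}(\mathbf{r}', t')\rangle = K_d(\mathbf{r}, t)\,\delta_{ij}\,\delta(\mathbf{r} - \mathbf{r}')\,\delta(t - t'), \tag{8.1.6}$$

that is, the different components are independent of each other at the same time and place, and all fluctuations at different times or places are independent. This is a locality assumption. The fluctuating diffusion equation is then

$$\partial_t\rho(\mathbf{r}, t) = D\nabla^2\rho(\mathbf{r}, t) - \nabla\cdot\mathbf{f}_d(\mathbf{r}, t). \tag{8.1.7}$$

Notice that

$$\langle \nabla\cdot\mathbf{f}_d(\mathbf{r}, t)\;\nabla'\cdot\mathbf{f}_d(\mathbf{r}', t')\rangle = \nabla\cdot\nabla'\big[K_d(\mathbf{r}, t)\,\delta(\mathbf{r} - \mathbf{r}')\big]\,\delta(t - t'). \tag{8.1.8}$$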
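The point that the noise enters through the current, and therefore cannot change the total amount of substance, is easy to see in a lattice version of (8.1.7). The following is a rough sketch under stated assumptions: one space dimension, periodic boundaries, a constant noise strength `noise_amp` standing in for $\sqrt{K_d}$, and a simple Euler-Maruyama update. None of these choices comes from the text.

```python
import random


def step(rho, diff, noise_amp, dt, dx, rng):
    """One Euler-Maruyama step of a 1-D lattice version of (8.1.7) on a ring.

    The current on the bond between site i and site i+1 is a discrete Fick
    term plus a Gaussian term; the update is minus the discrete divergence.
    """
    n = len(rho)
    current = []
    for i in range(n):
        fick = -diff * (rho[(i + 1) % n] - rho[i]) / dx
        # Lattice white noise: variance 1/(dx*dt) per bond and per step.
        noise = noise_amp * rng.gauss(0.0, 1.0) / (dx * dt) ** 0.5
        current.append(fick + noise)
    # current[i - 1] with i = 0 wraps to the last bond, closing the ring.
    return [rho[i] - dt * (current[i] - current[i - 1]) / dx for i in range(n)]


rng = random.Random(0)
rho = [1.0] * 10
rho[3] = 5.0
total_before = sum(rho)
for _ in range(200):
    rho = step(rho, diff=1.0, noise_amp=0.1, dt=1e-3, dx=1.0, rng=rng)
print(total_before, sum(rho))
```

Because every bond current is subtracted from one site and added to its neighbour, the sum over sites is unchanged up to rounding, whatever the noise realisation. Adding noise directly to the density equation instead would break this property.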


Now consider including a chemical reaction. Fick’s law still applies, but instead 
of the conservation equation we need an equation of the form 


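$$\frac{dN}{dt} = \frac{d}{dt}\int_V d^3r\,\rho(\mathbf{r}, t) = -\int_S d\mathbf{S}\cdot\mathbf{j}(\mathbf{r}, t) + \int_V d^3r\,F[\rho(\mathbf{r}, t)], \tag{8.1.9}$$

where $F[\rho(\mathbf{r}, t)]$ is a function of the concentration and represents the production of the chemical by a local chemical reaction.
Hence we find, before taking fluctuations into account,

$$\partial_t\rho(\mathbf{r}, t) + \nabla\cdot\mathbf{j}(\mathbf{r}, t) = F[\rho(\mathbf{r}, t)]. \tag{8.1.10}$$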




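The production of the chemical by a chemical reaction does generate fluctuations, so we can add to (8.1.10) a term $f_c(\mathbf{r}, t)$ which satisfies

$$\langle f_c(\mathbf{r}, t)\rangle = 0, \qquad \langle f_c(\mathbf{r}, t)\,f_c(\mathbf{r}', t')\rangle = K_c(\mathbf{r}, t)\,\delta(\mathbf{r} - \mathbf{r}')\,\delta(t - t'), \tag{8.1.11}$$

which expresses the fact that the reaction is local (i.e., fluctuations at different points are uncorrelated) and Markov (delta correlated in time). The full reaction-diffusion chemical equation now becomes

$$\partial_t\rho(\mathbf{r}, t) = D\nabla^2\rho(\mathbf{r}, t) + F[\rho(\mathbf{r}, t)] + g(\mathbf{r}, t), \tag{8.1.12}$$

where

$$g(\mathbf{r}, t) = -\nabla\cdot\mathbf{f}_d(\mathbf{r}, t) + f_c(\mathbf{r}, t) \tag{8.1.13}$$

and

$$\langle g(\mathbf{r}, t)\,g(\mathbf{r}', t')\rangle = \big\{K_c(\mathbf{r}, t)\,\delta(\mathbf{r} - \mathbf{r}') + \nabla\cdot\nabla'\big[K_d(\mathbf{r}, t)\,\delta(\mathbf{r} - \mathbf{r}')\big]\big\}\,\delta(t - t'). \tag{8.1.14}$$

The simplest procedure for turning a classical reaction diffusion equation into a Langevin equation yields a rather complex expression. Further, we know nothing about $K_c(\mathbf{r})$ or $K_d(\mathbf{r})$, and this procedure is based on very heuristic models.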

Nevertheless, the form derived is essentially correct in that it agrees with the 
results arising from a more microscopic approach based on Master equations, 
which, however, specifies all arbitrary constants precisely. 


8.1.1 Functional Fokker-Planck Equations 


By writing a stochastic partial differential equation such as (8.1.12), we immediately raise the question: what does the corresponding Fokker-Planck equation look like? It must be a partial differential equation in a continuously infinite number of variables $\rho(\mathbf{r})$, where $\mathbf{r}$ is the continuous index which distinguishes the various variables. A simple-minded way of defining functional derivatives is as follows. First, divide space into cubic cells of side $l$ labelled $i$ with position $\mathbf{r}_i$, and introduce the variables

$$x_i = l^3\rho(\mathbf{r}_i) \tag{8.1.15}$$

and consider functions of the variables $x = \{x_i\}$.

We now consider calculus of functions $F(x)$ of all these cell variables. Partial derivatives are easily defined in the usual way and we formally introduce the functional derivative by

$$\frac{\delta F(\rho)}{\delta\rho(\mathbf{r}_i)} = \lim_{l \to 0}\, l^{-3}\,\frac{\partial F(\rho)}{\partial x_i}. \tag{8.1.16}$$
In what sense this limit exists is, in most applied literature, left completely unde-
fined. Precise definitions can be given and, as is usual in matters dealing with
functionals, the precise definition of convergence is important. Further, the
"obvious" definition (8.1.16) is not used.

The precise formulation of functional calculus is not within the scope of this 
book, but an indication of what is normally done by workers who write such 
equations is appropriate. Effectively, the functional derivative is formally defined 
by (8.1.16) and a corresponding discretised version of the stochastic differential 
equation such as (8.1.12) is formulated. Using the same notation, this would be 


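$$dx_i = \Big[\sum_j D_{ij}x_j + \tilde F(x_i)\Big]dt + \sum_j \tilde g_{ij}\,dW_j(t). \tag{8.1.17}$$

In this equation, $D_{ij}$ are coefficients which yield a discretised approximation to $D\nabla^2$. The coefficients $\tilde F$ and $\tilde g$ are chosen so that

$$F[\rho(\mathbf{r}_i, t)] = \lim_{l \to 0}\tilde F(x_i)\,l^{-3} \tag{8.1.18}$$

$$g(\mathbf{r}_i, t) = \lim_{l \to 0}\, l^{-3}\sum_j \tilde g_{ij}\,dW_j(t). \tag{8.1.19}$$

More precisely, we assume a more general correlation formula than (8.1.14), i.e.,

$$\langle g(\mathbf{r}, t)\,g(\mathbf{r}', t')\rangle = G(\mathbf{r}, \mathbf{r}')\,\delta(t - t'), \tag{8.1.20}$$

and require

$$G(\mathbf{r}_i, \mathbf{r}_j) = \lim_{l \to 0}\, l^{-6}\sum_k \tilde g_{ik}\tilde g_{jk}. \tag{8.1.21}$$

In this case, the FPE for $x_i$ is

$$\partial_t P(x) = -\sum_{i,j}\frac{\partial}{\partial x_i}\big[D_{ij}x_j + \delta_{ij}\tilde F(x_i)\big]P(x) + \frac{1}{2}\sum_{i,j,k}\frac{\partial^2}{\partial x_i\,\partial x_j}\,\tilde g_{ik}\tilde g_{jk}\,P(x). \tag{8.1.22}$$

Now consider the limit $l^3 \to 0$. Some manipulation gives

$$\partial_t P(\rho) = -\int d^3r\,\frac{\delta}{\delta\rho(\mathbf{r})}\Big\{\big[D\nabla^2\rho(\mathbf{r}) + F[\rho(\mathbf{r})]\big]P(\rho)\Big\} + \frac{1}{2}\iint d^3r\,d^3r'\,\frac{\delta^2}{\delta\rho(\mathbf{r})\,\delta\rho(\mathbf{r}')}\big[G(\mathbf{r}, \mathbf{r}')P(\rho)\big]. \tag{8.1.23}$$

$P(\rho)$ is now a kind of functional probability and the definition of its normalisation requires a careful statement of the probability measure on $\rho(\mathbf{r})$. This can be done [8.1] but what is normally understood by (8.1.23) is really the discrete version (8.1.22), and almost all calculations implicitly discretise.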

The situation is clearly unsatisfactory. The formal mathematical existence of 
stochastic partial differential equations and their solutions has now been establi- 
shed, but as an everyday computational tool this has not been developed. We 
refer the reader to [8.1] for more information on the mathematical formulation. 
Since, however, most work is implicitly discretised, we will mostly formulate 
matters directly in a discretised form, using continuum notations simply as a
convenience in order to give a simpler notation.


8.2 Multivariate Master Equation Description 


8.2.1 Diffusion 


We assume that the space is divided into cubic cells of volume $\Delta V$ and side length $l$. The cells are labelled by an index $i$ and the number of molecules of a chemical $X$ inside cell $i$ is called $x_i$. Thus we introduce a multivariate probability

$$P(x, t) \equiv P(x_1, x_2, \ldots, t) \equiv P(x_i, \hat{x}, t). \tag{8.2.1}$$

In the last expression, $\hat{x}$ means the vector of all $x$'s not explicitly written.

We can model diffusion as a Markov process in which a molecule is transferred from cell $i$ to cell $j$ with probability per unit time $d_{ij}x_i$, i.e., the probability of transfer is proportional to the number of molecules in the cell. For a strictly local description, we expect that $d_{ij}$ will be nonzero only when $i$ and $j$ are neighbouring cells, but this is not necessary and will not always be assumed in what follows.

In terms of the notation of Sect. 7.5, we can write a birth-death Master equation 
with parameters given by the replacements: 


(8.2.2) 


Hence, the Master equation becomes 
$$\partial_t P(x, t) = \sum_{i,j} d_{ij}\big[(x_i + 1)P(\hat{x}, x_i + 1, x_j - 1, t) - x_i P(x, t)\big]. \tag{8.2.3}$$


This equation is a simple linear Master equation and can be solved by various 
means. 
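One of the "various means" is direct stochastic simulation. The sketch below uses the standard Gillespie algorithm for the hopping process behind (8.2.3), with each ordered pair of cells treated as one channel of propensity $d_{ij}x_i$. The function name, the ring geometry and the numbers in the example call are illustrative assumptions.

```python
import random


def simulate_hopping(x, d, t_end, rng):
    """Gillespie simulation of the diffusion Master equation (8.2.3).

    x : list of molecule numbers per cell (a modified copy is returned)
    d : d[i][j] is the hopping rate per molecule from cell i to cell j
    """
    x = list(x)
    n = len(x)
    t = 0.0
    while True:
        # Every ordered pair (i, j) is one channel with propensity d_ij * x_i.
        channels = [
            (i, j, d[i][j] * x[i])
            for i in range(n)
            for j in range(n)
            if i != j and d[i][j] > 0 and x[i] > 0
        ]
        total = sum(w for _, _, w in channels)
        if total == 0:
            return x
        t += rng.expovariate(total)  # waiting time to the next hop
        if t > t_end:
            return x
        pick = rng.random() * total
        for i, j, w in channels:
            pick -= w
            if pick < 0:
                break
        x[i] -= 1
        x[j] += 1


# Five cells on a ring with nearest-neighbour hopping at unit rate.
n = 5
d = [[1.0 if (i - j) % n in (1, n - 1) else 0.0 for j in range(n)] for i in range(n)]
print(simulate_hopping([20, 0, 0, 0, 0], d, t_end=2.0, rng=random.Random(0)))
```

Each event moves exactly one molecule, so the total number is conserved in every realisation, and averaging many runs gives an estimate of the mean occupation of each cell.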
Notice that since 


pbD = po 


(8.2.4) 
NGP = MU , 


we can also restrict to i > j and set kj y = dj, From (7.5.15,18) we see that in 
this form, detailed balance is satisfied provided 


Ai j<X)s = dy CX) - (8.2.5) 
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In a system which is diffusing, the stationary solution is homogenous, i.e., 


(Xi = (Xp) 5° 


Hence, detailed balance requires 
dy = dy, (8.2.6) 


and (8.2.3) possesses a multivariate Poisson stationary solution. 
The mean-value equation 1s 


didxdt)) = Tirta) — Gel@)>) 
= 2 (—di, ae Ork) djx<X;) . 


Hence, 


d<x(t)> = 2 (dy: — Oy 2 jn)<X,(t)) (8.2.7) 
= 2 D,Xx,(t)> . (8.2.8) 
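The matrix appearing in the mean-value equation can be built directly from the hopping rates. The sketch below assumes the reading used here, namely transfer from cell $i$ to cell $j$ at rate $d_{ij}x_i$, so that the mean of $x_i$ gains $d_{ji}\langle x_j\rangle$ and loses $d_{ij}\langle x_i\rangle$, together with the definition $D_{ij} = d_{ij} - \delta_{ij}\sum_k d_{ik}$ with $d_{ii} = 0$. Function names and the ring example are illustrative.

```python
def diffusion_matrix(d):
    """D_ij = d_ij - delta_ij * sum_k d_ik, built from the hopping rates d_ij."""
    n = len(d)
    return [
        [d[i][j] - (sum(d[i]) if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


def mean_rate(D, mean_x):
    """Right-hand side of the mean-value equation: sum_j D_ji <x_j> for each cell i."""
    n = len(D)
    return [sum(D[j][i] * mean_x[j] for j in range(n)) for i in range(n)]


# Four cells on a ring, nearest-neighbour hopping at unit rate.
n = 4
d = [[1.0 if (i - j) % n in (1, n - 1) else 0.0 for j in range(n)] for i in range(n)]
D = diffusion_matrix(d)
print(mean_rate(D, [1.0, 0.0, 0.0, 0.0]))  # [-2.0, 1.0, 0.0, 1.0]
print(mean_rate(D, [3.0, 3.0, 3.0, 3.0]))  # [0.0, 0.0, 0.0, 0.0]
```

For nearest-neighbour hopping the result is $d$ times the discrete second difference of the means, the rates sum to zero (conservation of molecules), and a uniform profile is stationary when $d_{ij}$ is symmetric.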


8.2.2 Continuum Form of Diffusion Master Equation 


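Suppose the centre of cell $i$ is located at $\mathbf{r}_i$ and we make the replacement

$$x_i(t) = l^3\rho(\mathbf{r}_i, t) \tag{8.2.9}$$

and assume that

$$d_{ij} = 0 \quad (i, j \text{ not nearest neighbours}), \qquad d_{ij} = d \quad (i, j \text{ adjacent}).$$

Then (8.2.8) becomes, in the limit $l \to 0$,

$$\partial_t\langle\rho(\mathbf{r}, t)\rangle = D\nabla^2\langle\rho(\mathbf{r}, t)\rangle \tag{8.2.10}$$

with

$$D = l^2 d. \tag{8.2.11}$$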


Thus, the diffusion equation is recovered. We will generalise this result shortly. 


a) Kramers-Moyal or System Size Expansion Equations 

We need a parameter in terms of which the numbers and transition probabilities 
scale appropriately. There are two limits which are possible, both of which corres- 
pond to increasing numbers of molecules: 


i) limit of large cells: $l \to \infty$, at fixed concentration;
ii) limit of high concentration at fixed cell size.


The results are the same for pure diffusion. In either case,

$$\langle x_i\rangle \to \infty \tag{8.2.12}$$


and a system size expansion is possible. To lowest order, this will be equivalent to a Kramers-Moyal expansion. From (7.5.31,32) we find

$$A_i(x) = \sum_j D_{ji}x_j \tag{8.2.13}$$

$$B_{lm}(x) = \delta_{lm}\sum_j\big(D_{lj}x_l + D_{jl}x_j\big) - D_{lm}x_l - D_{ml}x_m, \tag{8.2.14}$$

where

$$D_{ij} = d_{ij} - \delta_{ij}\sum_k d_{ik}, \tag{8.2.15}$$

and thus, in this limit, $P(x, t)$ obeys the Fokker-Planck equation

$$\partial_t P = -\sum_l \partial_l A_l(x)P + \frac{1}{2}\sum_{l,m}\partial_l\partial_m B_{lm}(x)P. \tag{8.2.16}$$


b) Continuum Form of Kramers-Moyal Expansion 
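The continuum form is introduced by associating a point $\mathbf{r}$ with a cell $i$ and writing

$$l^3\sum_i \to \int d^3r \tag{8.2.17}$$

$$D_{ji} \to D(\mathbf{r}', \mathbf{r}) = \mathcal{D}(\mathbf{r}', \mathbf{r} - \mathbf{r}') \tag{8.2.18}$$

$$l^{-3}\delta_{ij} \to \delta(\mathbf{r} - \mathbf{r}'). \tag{8.2.19}$$

At this stage we make no particular symmetry assumptions on $D_{ij}$, etc., so that anisotropic inhomogeneous diffusion is included.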

However, there are some requirements brought about by the meaning of the 
concept “‘diffusion.” 


i) Diffusion is observed only when a concentration gradient exists. This means that the stationary state corresponds to constant concentration and from (8.2.13,15), this means that

$$\sum_j D_{ji} = 0, \tag{8.2.20}$$

i.e.,

$$\sum_j d_{ij} = \sum_j d_{ji}. \tag{8.2.21}$$

Note that detailed balance (8.2.6) implies these.

ii) Diffusion does not change the total amount of substance in the system, i.e.,

$$\frac{d}{dt}\sum_i \langle x_i\rangle = 0 \tag{8.2.22}$$

and this must be true for any value of $x_i$. From the equation for the mean values, this requires

$$\sum_i D_{ji} = 0, \tag{8.2.23}$$

which follows from (8.2.15) and (8.2.21).

iii) In the continuum notation, (8.2.20) implies that for any $\mathbf{r}$,

$$\int d^3\delta\;\mathcal{D}(\mathbf{r} + \boldsymbol{\delta}, -\boldsymbol{\delta}) = 0 \tag{8.2.24}$$

and from (8.2.23), we also have

$$\int d^3\delta\;\mathcal{D}(\mathbf{r}, \boldsymbol{\delta}) = 0. \tag{8.2.25}$$

iv) If detailed balance is true, (8.2.24) is replaced by the equation obtained by substituting (8.2.6) in the definition of $D$, i.e.,

$$D_{ij} = D_{ji}, \tag{8.2.26}$$

which gives in the continuum form

$$\mathcal{D}(\mathbf{r} + \boldsymbol{\delta}, -\boldsymbol{\delta}) = \mathcal{D}(\mathbf{r}, \boldsymbol{\delta}). \tag{8.2.27}$$

The derivation of a continuum form now follows in a similar way to that of the 
Kramers-Moyal expansion. 
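We define the derivate moments

$$\mathbf{M}(\mathbf{r}) = \int d^3\delta\;\boldsymbol{\delta}\,\mathcal{D}(\mathbf{r}, \boldsymbol{\delta}) \tag{8.2.28}$$

$$\mathsf{D}(\mathbf{r}) = \frac{1}{2}\int d^3\delta\;\boldsymbol{\delta}\boldsymbol{\delta}\,\mathcal{D}(\mathbf{r}, \boldsymbol{\delta}) \tag{8.2.29}$$

and it is assumed that derivate moments of higher order vanish in some appropriate limit, similar to those used in the Kramers-Moyal expansion.
The detailed balance requirement (8.2.27) gives

$$\mathbf{M}(\mathbf{r}) = \int d^3\delta\;\boldsymbol{\delta}\,\mathcal{D}(\mathbf{r} + \boldsymbol{\delta}, -\boldsymbol{\delta}) = \int d^3\delta\;\boldsymbol{\delta}\big[\mathcal{D}(\mathbf{r}, -\boldsymbol{\delta}) + \boldsymbol{\delta}\cdot\nabla\mathcal{D}(\mathbf{r}, -\boldsymbol{\delta}) + \ldots\big] = -\mathbf{M}(\mathbf{r}) + 2\nabla\cdot\mathsf{D}(\mathbf{r}) + \ldots. \tag{8.2.30}$$

Hence, detailed balance requires

$$\mathbf{M}(\mathbf{r}) = \nabla\cdot\mathsf{D}(\mathbf{r}). \tag{8.2.31}$$

The weaker requirement (8.2.24) similarly requires the weaker condition

$$\nabla\cdot\big[\mathbf{M}(\mathbf{r}) - \nabla\cdot\mathsf{D}(\mathbf{r})\big] = 0. \tag{8.2.32}$$

We now can make the continuum form of $A_i(x)$:

$$A_i(x) \to \int d^3\delta\;\mathcal{D}(\mathbf{r}, \boldsymbol{\delta})\,\rho(\mathbf{r} + \boldsymbol{\delta}) \to \mathbf{M}(\mathbf{r})\cdot\nabla\rho(\mathbf{r}) + \mathsf{D}(\mathbf{r}) : \nabla\nabla\rho(\mathbf{r}). \tag{8.2.33}$$

If detailed balance is true, we can rewrite, from (8.2.31),

$$A_i(x) \to \nabla\cdot\big[\mathsf{D}(\mathbf{r})\cdot\nabla\rho(\mathbf{r})\big]. \tag{8.2.34}$$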
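The general form, without detailed balance, can be obtained by defining

$$\mathbf{J}(\mathbf{r}) = \mathbf{M}(\mathbf{r}) - \nabla\cdot\mathsf{D}(\mathbf{r}); \tag{8.2.35}$$

from (8.2.32)

$$\nabla\cdot\mathbf{J}(\mathbf{r}) = 0 \tag{8.2.36}$$

so that we can write

$$\mathbf{J}(\mathbf{r}) = \nabla\cdot\mathsf{E}(\mathbf{r}), \tag{8.2.37}$$

where $\mathsf{E}(\mathbf{r})$ is an antisymmetric tensor. Substituting, we find that by defining

$$\mathsf{H}(\mathbf{r}) = \mathsf{D}(\mathbf{r}) + \mathsf{E}(\mathbf{r}), \tag{8.2.38}$$

we have defined a nonsymmetric diffusion tensor $\mathsf{H}(\mathbf{r})$ and that

$$A_i(x) \to \nabla\cdot\big[\mathsf{H}(\mathbf{r})\cdot\nabla\rho(\mathbf{r}, t)\big]. \tag{8.2.39}$$

This means that, deterministically,

$$\partial_t\rho(\mathbf{r}, t) = \nabla\cdot\big[\mathsf{H}(\mathbf{r})\cdot\nabla\rho(\mathbf{r}, t)\big], \tag{8.2.40}$$

where $\mathsf{H}(\mathbf{r})$ is symmetric, if detailed balance holds.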
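We now come to the fluctuation term, given by (8.2.14). To compute $B_{lm}(x)$, we first consider the limit of

$$l^3\sum_m B_{lm}\phi_m \to \int d^3r'\,B(\mathbf{r}, \mathbf{r}')\,\phi(\mathbf{r}'), \tag{8.2.41}$$

where $\phi_m$ is an arbitrary function. By similar, but much more tedious computation, we eventually find

$$l^3\sum_m B_{lm}\phi_m \to -2\nabla\cdot\big[\mathsf{D}(\mathbf{r})\rho(\mathbf{r})\cdot\nabla\phi(\mathbf{r})\big] \tag{8.2.42}$$

so that

$$B(\mathbf{r}, \mathbf{r}') = 2\nabla\nabla' : \big[\mathsf{D}(\mathbf{r})\rho(\mathbf{r})\,\delta(\mathbf{r} - \mathbf{r}')\big]. \tag{8.2.43}$$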


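The phenomenological theory of Sect. 8.1 now has a rational basis since (8.2.43) is what arises from assuming

$$\mathbf{j}(\mathbf{r}, t) = -\mathsf{H}(\mathbf{r})\cdot\nabla\rho(\mathbf{r}, t) - \boldsymbol{\xi}(\mathbf{r}, t) \tag{8.2.44}$$

in which

$$\langle \boldsymbol{\xi}(\mathbf{r}, t)\,\boldsymbol{\xi}(\mathbf{r}', t')\rangle = 2\delta(t - t')\,\mathsf{D}(\mathbf{r})\,\delta(\mathbf{r} - \mathbf{r}')\,\rho(\mathbf{r}) \tag{8.2.45}$$

and hence,

$$\partial_t\rho(\mathbf{r}, t) = \nabla\cdot\mathsf{H}(\mathbf{r})\cdot\nabla\rho + \nabla\cdot\boldsymbol{\xi}(\mathbf{r}, t). \tag{8.2.46}$$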


This corresponds to a theory of inhomogeneous anisotropic diffusion without 
detailed balance. This is usually simplified by setting 


$$\mathsf{D}(\mathbf{r}) = D\,\mathsf{1} \tag{8.2.47}$$


and this gives a more familiar equation. Notice that according to (8.2.45), fluctua- 
tions in different components of the current are in general correlated, unless D 
is diagonal. 


Comparison with Fluctuation-Dissipation Argument. The result (8.2.43) can almost 
be obtained from a simple fluctuation-dissipation argument in the stationary state, 
where we know the fluctuations are Poissonian. In that case, 


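$$\langle x_i, x_j\rangle = \langle x_i\rangle\,\delta_{ij} \tag{8.2.48}$$

corresponding to

$$g(\mathbf{r}, \mathbf{r}') = \langle\rho(\mathbf{r}), \rho(\mathbf{r}')\rangle = \delta(\mathbf{r} - \mathbf{r}')\langle\rho(\mathbf{r})\rangle.$$

Since the theory is linear, we can apply (4.4.51) of Sect. 4.4.6. Here the matrices $A$ and $A^{\mathrm T}$ become

$$A \to -\nabla\cdot\mathsf{H}(\mathbf{r})\cdot\nabla, \qquad A^{\mathrm T} \to -\nabla'\cdot\mathsf{H}(\mathbf{r}')\cdot\nabla'. \tag{8.2.49}$$

Thus,

$$B(\mathbf{r}, \mathbf{r}') \to BB^{\mathrm T} = Ag + gA^{\mathrm T} \to \big[-\nabla\cdot\mathsf{H}(\mathbf{r})\cdot\nabla - \nabla'\cdot\mathsf{H}(\mathbf{r}')\cdot\nabla'\big]g(\mathbf{r}, \mathbf{r}'). \tag{8.2.50}$$

We note that in the stationary state, $\langle\rho(\mathbf{r})\rangle = \langle\rho\rangle$, independent of $\mathbf{r}$. Thus,

$$B(\mathbf{r}, \mathbf{r}') = \big[-\nabla\cdot\mathsf{H}(\mathbf{r})\cdot\nabla\langle\rho\rangle\delta(\mathbf{r} - \mathbf{r}') - \nabla'\cdot\mathsf{H}(\mathbf{r}')\cdot\nabla'\langle\rho\rangle\delta(\mathbf{r} - \mathbf{r}')\big] = \nabla\nabla' : \big\{[\mathsf{H}(\mathbf{r}) + \mathsf{H}^{\mathrm T}(\mathbf{r})]\langle\rho\rangle\delta(\mathbf{r} - \mathbf{r}')\big\} = 2\nabla\nabla' : \big[\mathsf{D}(\mathbf{r})\langle\rho\rangle\delta(\mathbf{r} - \mathbf{r}')\big]. \tag{8.2.51}$$

However, this result is not as general as (8.2.43) since it is valid in the stationary state only, nor is it the same in the stationary state because it includes $\langle\rho\rangle$, not $\rho(\mathbf{r})$. However, since the Fokker-Planck formalism is valid only as a large cell size limit in which fluctuations are small, to this accuracy (8.2.51) agrees with (8.2.43).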

8.2.3 Reactions and Diffusion Combined 


We introduce reactions by assuming that molecules within a cell react with each 
other according to a Master equation like those of Chap. 7. We wish to consider 
several chemical components so we introduce the notation 


$$X = (x_1, x_2, \ldots) \tag{8.2.52}$$

where $x_i$ represents a vector whose components $x_{i,a}$ are the numbers of molecules of species $X_a$ in cell $i$. Thus, we write

$$\partial_t P(X, t) = \partial_t P(\text{diffusion}) + \sum_i\sum_A\Big\{t_A^+(x_i - r^A)P(x_i - r^A, \hat{X}) + t_A^-(x_i + r^A)P(x_i + r^A, \hat{X}) - \big[t_A^+(x_i) + t_A^-(x_i)\big]P(X)\Big\} \tag{8.2.53}$$

where the diffusion part has the form of (8.2.3), but is summed over the various components.

This leads via a Kramers-Moyal expansion to a Fokker-Planck equation, with 
the usual drift and diffusion as given by (7.5.32). 

We can write equivalent stochastic partial differential equations in terms of a 
spatially dependent Wiener process W(r, t) as follows: we consider an isotropic 
constant diffusion tensor 


$$\mathsf{D}(\mathbf{r}) = D\,\mathsf{1} \tag{8.2.54}$$

$$d\rho_a = \Big[D_a\nabla^2\rho_a + \sum_A r_a^A\Big(\kappa_A^+\prod_b \rho_b^{N_b^A} - \kappa_A^-\prod_b \rho_b^{M_b^A}\Big)\Big]dt + dW_a(\mathbf{r}, t) \tag{8.2.55}$$

with

$$dW_a(\mathbf{r}, t)\,dW_b(\mathbf{r}', t) = \Big\{2\delta_{a,b}\,\nabla\cdot\nabla'\big[D_a\rho_a(\mathbf{r})\,\delta(\mathbf{r} - \mathbf{r}')\big] + \delta(\mathbf{r} - \mathbf{r}')\sum_A r_a^A r_b^A\Big(\kappa_A^+\prod_c \rho_c^{N_c^A} + \kappa_A^-\prod_c \rho_c^{M_c^A}\Big)\Big\}dt. \tag{8.2.56}$$

The validity of the Langevin equation depends on the system size expansion. Equations (8.2.55,56) depend on the particular scaling of the chemical rate constants with $\Omega$, given in Sect. 7.5.3 (7.5.29). The only interpretation of $\Omega$ which is valid in this case is

$$\Omega = l^3 = \text{volume of cell}.$$

Notice, however, that at the same time, the diffusion part scales like $l$ since $l^2 d$ must remain equal to the diffusion coefficient while the terms arising from chemical reactions scale like $l^3$. This means that as cell volume is increased, we have less and less effect from diffusion, but still more than the correction terms in the chemical part which will be integral powers of $l^3$ less than the first.
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The precise method of comparing diffusion with reaction will be dealt with 
later. 


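Example: $X_1 \rightleftharpoons 2X_2$: for this chemical reaction, we find (using the methods of
Sect. 7.5)


$$ N = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad M = \begin{pmatrix} 0 \\ 2 \end{pmatrix}, \qquad \mathbf{r} = \begin{pmatrix} -1 \\ 2 \end{pmatrix}. \tag{8.2.57} $$


Thus, substituting


$$
\begin{aligned}
d\rho_1(\mathbf{r}) &= (D_1\nabla^2\rho_1 - \kappa_1\rho_1 + \kappa_2\rho_2^2)\,dt + dW_1(\mathbf{r},t) \\
d\rho_2(\mathbf{r}) &= (D_2\nabla^2\rho_2 + 2\kappa_1\rho_1 - 2\kappa_2\rho_2^2)\,dt + dW_2(\mathbf{r},t)
\end{aligned}
\tag{8.2.58}
$$


where

$$
dW(\mathbf{r},t)\,dW^{\mathrm T}(\mathbf{r}',t) =
\begin{pmatrix}
2\nabla\cdot\nabla'[D_1\rho_1\delta(\mathbf{r}-\mathbf{r}')] + \delta(\mathbf{r}-\mathbf{r}')[\kappa_1\rho_1 + \kappa_2\rho_2^2] & -2\delta(\mathbf{r}-\mathbf{r}')[\kappa_1\rho_1 + \kappa_2\rho_2^2] \\
-2\delta(\mathbf{r}-\mathbf{r}')[\kappa_1\rho_1 + \kappa_2\rho_2^2] & 2\nabla\cdot\nabla'[D_2\rho_2\delta(\mathbf{r}-\mathbf{r}')] + 4\delta(\mathbf{r}-\mathbf{r}')[\kappa_1\rho_1 + \kappa_2\rho_2^2]
\end{pmatrix} dt .
\tag{8.2.59}
$$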


This equation is valid only as a system size expansion in $\Omega^{-1}$, that is, in inverse powers of the cell size,
and the continuum limit is to be regarded as a notation in which it is understood that
we really mean a cell model, and are working on a sufficiently large scale for the
cell size to appear small, though the cell itself is big enough to admit many molecules.

Thus, this kind of equation is really only valid as a linearised equation about
the deterministic state which is the form in which Keizer [8.2] has formulated
chemical reactions. In this respect, the Poisson representation is better since it gives
equations exactly equivalent to the Master equation.


8.2.4 Poisson Representation Methods 


Corresponding to a reaction with no more than bimolecular steps, we have from
(7.7.9) a rather simplified Fokker-Planck equation since for the spatial diffusion
[using the formulation (8.2.2)], we find the diffusion matrix vanishes. The generalisation
to a spatially dependent system is then carried out in the density variable


$$ \eta_a(\mathbf{r}) = \alpha_a(\mathbf{r})/l^3 \tag{8.2.60} $$


and we find


$$ d\eta_a(\mathbf{r}) = \Big[ D_a\nabla^2\eta_a(\mathbf{r}) + \sum_A r_a^A\Big(\kappa_A^+\prod_b \eta_b^{N_b^A} - \kappa_A^-\prod_b \eta_b^{M_b^A}\Big)\Big]dt + dW_a(\mathbf{r},t) \tag{8.2.61} $$


$$
\begin{aligned}
dW_a(\mathbf{r},t)\,dW_b(\mathbf{r}',t) = dt\,\delta(\mathbf{r}-\mathbf{r}')\sum_A & \Big(\kappa_A^+\prod_c \eta_c^{N_c^A} - \kappa_A^-\prod_c \eta_c^{M_c^A}\Big) \\
& \times (M_a^A M_b^A - N_a^A N_b^A - \delta_{a,b}\,r_a^A) .
\end{aligned}
\tag{8.2.62}
$$

These equations are very similar to (8.2.55,56). When explicitly written out they
are simpler. For example, considering again $X_1 \rightleftharpoons 2X_2$, we get


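$$ d\eta_1(\mathbf{r}) = (D_1\nabla^2\eta_1 - \kappa_1\eta_1 + \kappa_2\eta_2^2)\,dt + dW_1(\mathbf{r},t) \tag{8.2.63} $$

$$ d\eta_2(\mathbf{r}) = (D_2\nabla^2\eta_2 + 2\kappa_1\eta_1 - 2\kappa_2\eta_2^2)\,dt + dW_2(\mathbf{r},t) \tag{8.2.64} $$

$$ dW(\mathbf{r},t)\,dW^{\mathrm T}(\mathbf{r}',t) = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}(\kappa_1\eta_1 - \kappa_2\eta_2^2)\,\delta(\mathbf{r}-\mathbf{r}')\,dt . \tag{8.2.65} $$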


The simplicity of (8.2.63-65) when compared to their counterparts (8.2.57, 58) 
is quite striking, and it is especially noteworthy that they are exactly equivalent 
(in a continuum formulation) to the Master equation. 
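
As a small illustration of why the Poisson-representation noise is simpler, the stoichiometric factors that multiply the rate terms can be tabulated directly. The sketch below assumes the reading $N = (1, 0)$, $M = (0, 2)$ for $X_1 \rightleftharpoons 2X_2$; the function names are hypothetical and chosen only for this example.

```python
# Sketch: stoichiometric factors entering the noise correlations for X1 <-> 2 X2.
# N and M are the reactant / product stoichiometric vectors; r = M - N.

def kramers_moyal_factor(N, M):
    """Matrix r_a * r_b, which multiplies (forward rate + backward rate)."""
    r = [m - n for n, m in zip(N, M)]
    return [[ra * rb for rb in r] for ra in r]

def poisson_factor(N, M):
    """Matrix M_a M_b - N_a N_b - delta_ab r_a, which multiplies (forward - backward)."""
    r = [m - n for n, m in zip(N, M)]
    size = len(N)
    return [[M[a] * M[b] - N[a] * N[b] - (r[a] if a == b else 0)
             for b in range(size)] for a in range(size)]

N = [1, 0]   # one X1 is consumed by the forward step
M = [0, 2]   # two X2 are produced by the forward step

km = kramers_moyal_factor(N, M)
pr = poisson_factor(N, M)
print("Kramers-Moyal factor:", km)
print("Poisson factor:      ", pr)
```

Only the (2,2) entry of the Poisson factor survives, which is the structure of the noise matrix in (8.2.65), whereas the Kramers-Moyal factor is a full matrix.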


8.3. Spatial and Temporal Correlation Structures 


We want to consider here various aspects of spatial, temporal and spatio-temporal 
correlations in linear systems, which are of course all exactly soluble. The correla- 
tions that are important are the factorial correlations which are defined in terms of 
factorial moments in the same way as ordinary correlations are defined in terms 
of moments. The equations which arise are written much more naturally in terms 
of factorial moments, as we shall see in the next few examples. 


8.3.1 Reaction $X \underset{k_2}{\overset{k_1}{\rightleftharpoons}} Y$


We assume homogeneous isotropic diffusion with the same diffusion constant for
$X$ and $Y$, and since both the reaction and the diffusion are linear we find Poisson
representation Langevin equations for the concentration variables $\eta$, $\mu$ (corresponding,
respectively, to $X$ and $Y$) with no stochastic source, i.e.,

$$
\begin{aligned}
\partial_t\eta(\mathbf{r},t) &= D\nabla^2\eta - k_1\eta + k_2\mu \\
\partial_t\mu(\mathbf{r},t) &= D\nabla^2\mu + k_1\eta - k_2\mu .
\end{aligned}
\tag{8.3.1}
$$


a) Spatial Correlations 
We now note that 


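$$
\begin{aligned}
\langle\eta(\mathbf{r},t)\rangle &= \langle\rho_x(\mathbf{r},t)\rangle \\
\langle\mu(\mathbf{r},t)\rangle &= \langle\rho_y(\mathbf{r},t)\rangle \\
\langle\eta(\mathbf{r},t),\eta(\mathbf{r}',t)\rangle &= \langle\rho_x(\mathbf{r},t),\rho_x(\mathbf{r}',t)\rangle - \delta(\mathbf{r}-\mathbf{r}')\langle\rho_x(\mathbf{r},t)\rangle \equiv g(\mathbf{r},\mathbf{r}',t) \\
\langle\eta(\mathbf{r},t),\mu(\mathbf{r}',t)\rangle &= \langle\rho_x(\mathbf{r},t),\rho_y(\mathbf{r}',t)\rangle \equiv f(\mathbf{r},\mathbf{r}',t) \\
\langle\mu(\mathbf{r},t),\mu(\mathbf{r}',t)\rangle &= \langle\rho_y(\mathbf{r},t),\rho_y(\mathbf{r}',t)\rangle - \delta(\mathbf{r}-\mathbf{r}')\langle\rho_y(\mathbf{r},t)\rangle \equiv h(\mathbf{r},\mathbf{r}',t),
\end{aligned}
\tag{8.3.2}
$$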


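which are all continuum notation versions of the fact that the Poissonian moments
are equal to the factorial moments of the actual numbers.
The equations for the mean concentrations are obviously exactly the same as
(8.3.1). Assuming now a homogeneous situation, so that we can assume


$$
\begin{aligned}
\langle\rho_x(\mathbf{r},t)\rangle &= \langle\rho_x(t)\rangle \\
\langle\rho_y(\mathbf{r},t)\rangle &= \langle\rho_y(t)\rangle \\
g(\mathbf{r},\mathbf{r}',t) &= g(\mathbf{r}-\mathbf{r}',t) \\
f(\mathbf{r},\mathbf{r}',t) &= f(\mathbf{r}-\mathbf{r}',t) \\
h(\mathbf{r},\mathbf{r}',t) &= h(\mathbf{r}-\mathbf{r}',t)
\end{aligned}
\tag{8.3.3}
$$


and compute equations of motion for $\langle\eta(\mathbf{r},t)\eta(0,t)\rangle$, etc., we quickly find


$$
\begin{aligned}
\frac{\partial g(\mathbf{r},t)}{\partial t} &= 2D\nabla^2 g(\mathbf{r},t) - 2k_1 g(\mathbf{r},t) + 2k_2 f(\mathbf{r},t) \\
\frac{\partial f(\mathbf{r},t)}{\partial t} &= 2D\nabla^2 f(\mathbf{r},t) - (k_1+k_2) f(\mathbf{r},t) + k_2 h(\mathbf{r},t) + k_1 g(\mathbf{r},t) \\
\frac{\partial h(\mathbf{r},t)}{\partial t} &= 2D\nabla^2 h(\mathbf{r},t) - 2k_2 h(\mathbf{r},t) + 2k_1 f(\mathbf{r},t) .
\end{aligned}
\tag{8.3.4}
$$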

The stationary solution of these equations has the form 


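$$ g(\mathbf{r}) = \xi k_2^2, \qquad f(\mathbf{r}) = \xi k_1 k_2, \qquad h(\mathbf{r}) = \xi k_1^2 \tag{8.3.5} $$


where $\xi$ is an arbitrary parameter. The corresponding stationary solutions for
the means are


$$ \langle x(\mathbf{r}_i)\rangle = \lambda k_2, \qquad \langle y(\mathbf{r}_i)\rangle = \lambda k_1 \tag{8.3.6} $$


where $\lambda$ is another arbitrary parameter. If $\xi = 0$, we recover the Poissonian situation
where


$$
\begin{aligned}
\langle\rho_x(\mathbf{r}),\rho_x(\mathbf{r}')\rangle &= \langle\rho_x(\mathbf{r})\rangle\,\delta(\mathbf{r}-\mathbf{r}') \\
\langle\rho_y(\mathbf{r}),\rho_y(\mathbf{r}')\rangle &= \langle\rho_y(\mathbf{r})\rangle\,\delta(\mathbf{r}-\mathbf{r}') \\
\langle\rho_x(\mathbf{r}),\rho_y(\mathbf{r}')\rangle &= 0 .
\end{aligned}
\tag{8.3.7}
$$


(By choosing other values of $\lambda$, different solutions corresponding to various distributions
over the total number of molecules in the system are obtained.)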

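Time-dependent solutions for any initial condition can easily be developed.
In the case where the solutions are initially homogeneous, uncorrelated and Poissonian,
(8.3.7) is satisfied as an initial condition and thus $f$, $g$, and $h$ are initially
all zero, and will remain so. Thus, an uncorrelated Poissonian form is preserved in
time, as has already been deduced in Sect. 7.7b.

The problem of relaxation to the Poisson is best dealt with by assuming a specific
form for the initial correlation function. For example, an initially uncorrelated
but non-Poissonian system is represented by


$$ g(\mathbf{r},0) = \alpha\,\delta(\mathbf{r}), \qquad f(\mathbf{r},0) = \beta\,\delta(\mathbf{r}), \qquad h(\mathbf{r},0) = \gamma\,\delta(\mathbf{r}). \tag{8.3.8} $$


Time-dependent solutions are


$$
\begin{pmatrix} g(\mathbf{r},t) \\ f(\mathbf{r},t) \\ h(\mathbf{r},t) \end{pmatrix}
= \frac{\exp(-r^2/8Dt)}{(8\pi Dt)^{3/2}}
\begin{pmatrix}
k_2^2\varepsilon_1 - 2k_2\varepsilon_2 e^{-(k_1+k_2)t} + \varepsilon_3 e^{-2(k_1+k_2)t} \\
k_1k_2\varepsilon_1 + (k_2-k_1)\varepsilon_2 e^{-(k_1+k_2)t} - \varepsilon_3 e^{-2(k_1+k_2)t} \\
k_1^2\varepsilon_1 + 2k_1\varepsilon_2 e^{-(k_1+k_2)t} + \varepsilon_3 e^{-2(k_1+k_2)t}
\end{pmatrix}
\tag{8.3.9}
$$

where

$$
\begin{aligned}
\varepsilon_1 &= (\alpha + 2\beta + \gamma)/(k_1+k_2)^2 \\
\varepsilon_2 &= [k_2(\beta+\gamma) - k_1(\alpha+\beta)]/(k_1+k_2)^2 \\
\varepsilon_3 &= [k_1^2\alpha + k_2^2\gamma - 2k_1k_2\beta]/(k_1+k_2)^2 .
\end{aligned}
\tag{8.3.10}
$$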
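
The chemical (time-dependent) factors of the solution can be checked numerically. The sketch below is only a consistency check under stated assumptions: the sample values of $\alpha$, $\beta$, $\gamma$, $k_1$, $k_2$ are arbitrary, the diffusive prefactor is left out, and the function names are hypothetical. It confirms that the three mode amplitudes reproduce the initial data and that the bracketed factors obey the reaction part of (8.3.4).

```python
import math

def epsilons(alpha, beta, gamma, k1, k2):
    """Mode amplitudes of (8.3.10)."""
    s2 = (k1 + k2) ** 2
    e1 = (alpha + 2 * beta + gamma) / s2
    e2 = (k2 * (beta + gamma) - k1 * (alpha + beta)) / s2
    e3 = (k1 ** 2 * alpha + k2 ** 2 * gamma - 2 * k1 * k2 * beta) / s2
    return e1, e2, e3

def chemical_part(t, alpha, beta, gamma, k1, k2):
    """Bracketed factors of g, f, h in (8.3.9), without the diffusive prefactor."""
    e1, e2, e3 = epsilons(alpha, beta, gamma, k1, k2)
    d1 = math.exp(-(k1 + k2) * t)
    d2 = math.exp(-2 * (k1 + k2) * t)
    g = k2 ** 2 * e1 - 2 * k2 * e2 * d1 + e3 * d2
    f = k1 * k2 * e1 + (k2 - k1) * e2 * d1 - e3 * d2
    h = k1 ** 2 * e1 + 2 * k1 * e2 * d1 + e3 * d2
    return g, f, h

def reaction_rhs(g, f, h, k1, k2):
    """Reaction terms of (8.3.4) with the Laplacian dropped."""
    return (-2 * k1 * g + 2 * k2 * f,
            -(k1 + k2) * f + k2 * h + k1 * g,
            -2 * k2 * h + 2 * k1 * f)

def max_residual(t, alpha, beta, gamma, k1, k2, dt=1e-6):
    """Largest mismatch between a centred time derivative and the reaction terms."""
    plus = chemical_part(t + dt, alpha, beta, gamma, k1, k2)
    minus = chemical_part(t - dt, alpha, beta, gamma, k1, k2)
    here = chemical_part(t, alpha, beta, gamma, k1, k2)
    rhs = reaction_rhs(*here, k1, k2)
    return max(abs((p - m) / (2 * dt) - r) for p, m, r in zip(plus, minus, rhs))

params = (0.3, -0.1, 0.7, 1.5, 0.5)   # arbitrary alpha, beta, gamma, k1, k2
print("initial values:", chemical_part(0.0, *params))
print("residual at t = 0.4:", max_residual(0.4, *params))
```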


Comments 

i) The terms $\varepsilon_1$, $\varepsilon_2$, and $\varepsilon_3$ correspond, respectively, to deviations from an uncorrelated
Poissonian of the quantities $\langle(x_i + y_i), (x_j + y_j)\rangle$, $\langle(x_i + y_i), (k_1 y_j - k_2 x_j)\rangle$,
and $\langle(k_1 y_i - k_2 x_i), (k_1 y_j - k_2 x_j)\rangle$, which are essentially density fluctuations,
correlation between density fluctuation and chemical imbalance, and fluctuations
in chemical imbalance. We notice a characteristic diffusion form multiplying a
chemical time dependence appropriate to the respective terms.

ii) The time taken for the deviation from a Poissonian uncorrelated form given by
(8.3.8) to become negligible compared to the Poissonian depends, of course, on the
magnitude of the initial deviation. Assuming, however, that $\alpha$, $\beta$, $\gamma$, $\langle x_i\rangle$, and
$\langle y_i\rangle$ are all of comparable size, one can make a rough estimate as follows. We
consider a small spherical volume of radius $R$ much larger, however, than our
basic cells. Then in this small volume $V$, we find that


$$ \mathrm{var}\{x[V,0]\} = \int_V d^3r \int_V d^3r'\,\langle\rho_x(\mathbf{r},0),\rho_x(\mathbf{r}',0)\rangle, $$

i.e.,

$$ \mathrm{var}\{x[V,0]\} = \langle x[V,0]\rangle + \alpha V \tag{8.3.11} $$

and similarly,

$$
\begin{aligned}
\mathrm{var}\{y[V,0]\} &= \langle y[V,0]\rangle + \gamma V \\
\langle x[V,0], y[V,0]\rangle &= \beta V
\end{aligned}
$$

while after a time $t \gg R^2/4D$, these quantities satisfy approximately

$$
\begin{aligned}
\mathrm{var}\{x[V,t]\} &= \langle x[V,t]\rangle + \frac{V^2}{(8\pi Dt)^{3/2}}\left(k_2^2\varepsilon_1 - 2k_2\varepsilon_2 e^{-(k_1+k_2)t} + \varepsilon_3 e^{-2(k_1+k_2)t}\right) \\
\mathrm{var}\{y[V,t]\} &= \langle y[V,t]\rangle + \frac{V^2}{(8\pi Dt)^{3/2}}\left(k_1^2\varepsilon_1 + 2k_1\varepsilon_2 e^{-(k_1+k_2)t} + \varepsilon_3 e^{-2(k_1+k_2)t}\right) \\
\langle x[V,t], y[V,t]\rangle &= \frac{V^2}{(8\pi Dt)^{3/2}}\left[k_1k_2\varepsilon_1 + (k_2-k_1)\varepsilon_2 e^{-(k_1+k_2)t} - \varepsilon_3 e^{-2(k_1+k_2)t}\right].
\end{aligned}
\tag{8.3.12}
$$

Thus, the diffusion has reduced the overall deviation from Poissonian uncorrelated
behaviour by a factor of the order of magnitude of $R^3/(Dt)^{3/2}$. However, notice
that in the case of an initial non-Poissonian, but also uncorrelated, situation,
corresponding to $\beta = 0$, we find that a correlation has appeared between $X$ and $Y$,
which if the chemical rate constants are sufficiently large, can be quite substantial.


b) Space-Time Correlations 
Since the equations of motion here are linear, we may use the linear theory developed
in Sect. 3.7.4. Define the stationary two-time correlation matrix $G(\mathbf{r},t)$ by


$$
G(\mathbf{r},t) = \begin{pmatrix}
\langle\rho_x(\mathbf{r},t),\rho_x(0,0)\rangle_s & \langle\rho_x(\mathbf{r},t),\rho_y(0,0)\rangle_s \\
\langle\rho_y(\mathbf{r},t),\rho_x(0,0)\rangle_s & \langle\rho_y(\mathbf{r},t),\rho_y(0,0)\rangle_s
\end{pmatrix}.
\tag{8.3.13}
$$

Then the equation corresponding to (3.7.63) is

$$
\partial_t G(\mathbf{r},t) = \begin{pmatrix} D\nabla^2 - k_1 & k_2 \\ k_1 & D\nabla^2 - k_2 \end{pmatrix} G(\mathbf{r},t). \tag{8.3.14}
$$


The solution can be obtained by Fourier transforming and solving the resultant
first-order differential matrix equation by standard methods, with boundary conditions
at $t = 0$ given by (8.3.7). The result is


$$
G(\mathbf{r},t) = \frac{\langle\rho_x\rangle + \langle\rho_y\rangle}{(k_1+k_2)^2}\,\frac{\exp(-r^2/4Dt)}{(4\pi Dt)^{3/2}}
\begin{pmatrix}
k_2^2 + k_1k_2 e^{-(k_1+k_2)t} & k_1k_2\,(1 - e^{-(k_1+k_2)t}) \\
k_1k_2\,(1 - e^{-(k_1+k_2)t}) & k_1^2 + k_1k_2 e^{-(k_1+k_2)t}
\end{pmatrix}.
\tag{8.3.15}
$$

If we define variables

$$ n(\mathbf{r},t) = \rho_x(\mathbf{r},t) + \rho_y(\mathbf{r},t) \tag{8.3.16} $$

$$ c(\mathbf{r},t) = [k_1\rho_x(\mathbf{r},t) - k_2\rho_y(\mathbf{r},t)]/(k_1+k_2), \tag{8.3.17} $$

the solution (8.3.15) can be written as

$$ \langle n(\mathbf{r},t), n(0,0)\rangle_s = \langle n\rangle_s\,\frac{\exp(-r^2/4Dt)}{(4\pi Dt)^{3/2}} \tag{8.3.18} $$

$$ \langle n(\mathbf{r},t), c(0,0)\rangle_s = \langle c(\mathbf{r},t), n(0,0)\rangle_s = 0 \tag{8.3.19} $$

$$ \langle c(\mathbf{r},t), c(0,0)\rangle_s = \frac{k_1k_2\langle n\rangle_s}{(k_1+k_2)^2}\,\frac{\exp(-r^2/4Dt)}{(4\pi Dt)^{3/2}}\,e^{-(k_1+k_2)t}. \tag{8.3.20} $$


The variables $n$ and $c$ correspond to total density and chemical imbalance density,
i.e., $\langle c\rangle_s = 0$. Thus we see that (8.3.18) gives the correlation for density fluctuations
which is the same as that arising from pure diffusion; (8.3.20) gives the
correlation of fluctuations in chemical imbalance, and (8.3.19) shows that these are
independent. A characteristic diffusion term multiplies all of these. The simplicity
of this result depends on the identity of diffusion constants for the different species.


8.3.2 Reactions $B + X \underset{k_3}{\overset{k_1}{\rightleftharpoons}} C$, $A + X \overset{k_2}{\longrightarrow} 2X$


This reaction has already been treated in Sect. 7.6.4a without spatial dependence.
We find the Poisson representation equation for the concentration variable $\eta(\mathbf{r},t)$
is [from (8.2.60-62), (7.6.55,56)]


$$ d\eta(\mathbf{r},t) = [D\nabla^2\eta(\mathbf{r},t) + (K_2 - K_1)\eta(\mathbf{r},t) + K_3]\,dt + dW(\mathbf{r},t) \tag{8.3.21} $$

$$ dW(\mathbf{r},t)\,dW(\mathbf{r}',t) = 2\,dt\,\delta(\mathbf{r}-\mathbf{r}')\,K_2\,\eta(\mathbf{r},t), \tag{8.3.22} $$

where

$$
\begin{aligned}
k_3 C &= K_3 l^3 \\
k_1 B &= K_1 \\
k_2 A &= K_2 .
\end{aligned}
\tag{8.3.23}
$$


This system can be solved exactly since it is linear, but since the second reaction
creates $X$ at a rate proportional to $X$, we do obtain a noise term in the Poisson
representation.

[In the Kramers-Moyal method, we would find an equation of the same form
as (8.3.21) for $\rho(\mathbf{r},t)$, but the $dW(\mathbf{r},t)$ would satisfy


$$
dW(\mathbf{r},t)\,dW(\mathbf{r}',t) = dt\,\{2\nabla'\cdot\nabla[D\rho(\mathbf{r})\delta(\mathbf{r}-\mathbf{r}')] + \delta(\mathbf{r}-\mathbf{r}')[(K_2 + K_1)\rho(\mathbf{r},t) + K_3]\}\,]. \tag{8.3.24}
$$


a) Spatial Correlations 
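Define now


$$
\begin{aligned}
g(\mathbf{r},\mathbf{r}',t) &= \langle\rho(\mathbf{r},t),\rho(\mathbf{r}',t)\rangle - \delta(\mathbf{r}-\mathbf{r}')\langle\rho(\mathbf{r},t)\rangle \\
&= \langle\eta(\mathbf{r},t),\eta(\mathbf{r}',t)\rangle = \langle\eta(\mathbf{r},t)\eta(\mathbf{r}',t)\rangle - \langle\eta(\mathbf{r},t)\rangle\langle\eta(\mathbf{r}',t)\rangle .
\end{aligned}
\tag{8.3.25}
$$

We consider the stationary homogeneous situation in which clearly


$$ \langle\eta(\mathbf{r},t)\rangle_s = \langle\rho(\mathbf{r},t)\rangle_s = \langle\rho\rangle_s . \tag{8.3.26} $$

Then,


$$
\begin{aligned}
dg(\mathbf{r},\mathbf{r}',t) &= d\langle\eta(\mathbf{r},t)\eta(\mathbf{r}',t)\rangle \\
&= \langle\eta(\mathbf{r},t)\,d\eta(\mathbf{r}',t)\rangle + \langle d\eta(\mathbf{r},t)\,\eta(\mathbf{r}',t)\rangle + \langle d\eta(\mathbf{r},t)\,d\eta(\mathbf{r}',t)\rangle
\end{aligned}
\tag{8.3.27}
$$


and using the usual Ito rules and (8.3.21,22),


$$
\begin{aligned}
dg(\mathbf{r},\mathbf{r}',t) = {}& [D\nabla^2 + D\nabla'^2 + 2(K_2 - K_1)]\langle\eta(\mathbf{r},t)\eta(\mathbf{r}',t)\rangle\,dt \\
& + 2K_3\langle\rho\rangle_s\,dt + 2K_2\,\delta(\mathbf{r}-\mathbf{r}')\langle\rho\rangle_s\,dt .
\end{aligned}
\tag{8.3.28}
$$


Note that

$$ \langle\rho\rangle_s = K_3/(K_1 - K_2), \tag{8.3.29} $$


and that in a spatially homogeneous situation, $g(\mathbf{r},\mathbf{r}',t)$ can only be a function of
$\mathbf{r} - \mathbf{r}'$, which we call $g(\mathbf{r},t)$.
Substitute using (8.3.25,26) to obtain


$$ \partial_t g(\mathbf{r},t) = 2[D\nabla^2 + (K_2 - K_1)]\,g(\mathbf{r},t) + 2K_2\langle\rho\rangle_s\,\delta(\mathbf{r}) . \tag{8.3.30} $$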


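The stationary solution $g_s(\mathbf{r})$ is best obtained by representing it as a Fourier integral:


$$ g_s(\mathbf{r}) = \int d^3q\; e^{-i\mathbf{q}\cdot\mathbf{r}}\,\tilde g_s(\mathbf{q}) \tag{8.3.31} $$


which, on using


$$ \delta(\mathbf{r}) = (2\pi)^{-3}\int d^3q\; e^{-i\mathbf{q}\cdot\mathbf{r}}, \tag{8.3.32} $$

gives

$$ \tilde g_s(\mathbf{q}) = \frac{K_2\langle\rho\rangle_s}{(2\pi)^3}\,\frac{1}{Dq^2 + K_1 - K_2} \tag{8.3.33} $$


whose inverse is given by


$$ g_s(\mathbf{r}) = \frac{K_2\langle\rho\rangle_s}{4\pi D r}\exp\left[-r\left(\frac{K_1 - K_2}{D}\right)^{1/2}\right]. \tag{8.3.34} $$


Hence,


$$
\langle\rho(\mathbf{r},t),\rho(\mathbf{r}',t)\rangle_s = \delta(\mathbf{r}-\mathbf{r}')\langle\rho\rangle_s + \frac{K_2\langle\rho\rangle_s}{4\pi D|\mathbf{r}-\mathbf{r}'|}\exp\left[-|\mathbf{r}-\mathbf{r}'|\left(\frac{K_1 - K_2}{D}\right)^{1/2}\right]. \tag{8.3.35}
$$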

Comments 

i) We note two distinct parts: a $\delta$-correlated part which corresponds to Poissonian
fluctuations, independent at different space points, and added to this, a correlation
term with a characteristic correlation length


$$ l_c = \sqrt{D/(K_1 - K_2)}. \tag{8.3.36} $$


ii) Approach to equilibrium: when $K_2 \to 0$, we approach a simple reversible reaction
$B + X \rightleftharpoons C$; one sees that the correlation in (8.3.34) becomes zero. However,
the correlation length itself does not vanish.

iii) Local and global fluctuations: a question raised originally by Nicolis [8.3] is
the following. Consider the total number of molecules in a volume $V$:


$$ x[V,t] = \int_V d^3r\,\rho(\mathbf{r},t). \tag{8.3.37} $$


Then we would like to know what is the variance of this number. We can easily see

$$ \mathrm{var}\{x(V)\}_s = \int_V d^3r\int_V d^3r'\,\langle\rho(\mathbf{r},t),\rho(\mathbf{r}',t)\rangle_s \tag{8.3.38} $$

and using (8.3.35),

$$ \mathrm{var}\{x(V)\}_s = \langle x(V)\rangle_s + \frac{K_2\langle\rho\rangle_s}{4\pi D}\int_V d^3r\int_V d^3r'\,|\mathbf{r}-\mathbf{r}'|^{-1}\exp(-|\mathbf{r}-\mathbf{r}'|/l_c). \tag{8.3.39} $$

We can compute two limits to this. If $V$ is much larger than $l_c^3$, then, noting that
$g(\mathbf{r}) \to 0$ as $r \to \infty$, we integrate the stationary version of (8.3.30) and drop the
surface terms arising from integrating the Laplacian by parts:


$$ 0 = 2(K_2 - K_1)\int_V d^3r\int_V d^3r'\,g(\mathbf{r}-\mathbf{r}') + 2K_2\langle\rho\rangle_s\int_V d^3r . \tag{8.3.40} $$

Thus,

$$ \int_V d^3r\int_V d^3r'\,g(\mathbf{r}-\mathbf{r}') \sim \frac{K_2\langle\rho\rangle_s V}{K_1 - K_2} \tag{8.3.41} $$

so that

$$ \mathrm{var}\{x(V)\}_s \sim \frac{K_1}{K_1 - K_2}\,\langle x(V)\rangle_s \qquad (V \gg l_c^3). \tag{8.3.42} $$


The small volume limit is somewhat more tricky. However, for a spherical volume
of radius $R \ll l_c$, we can neglect the exponential in (8.3.39) and


$$
\begin{aligned}
\int_V\int_V d^3r\,d^3r'\,|\mathbf{r}-\mathbf{r}'|^{-1} &= \int_V d^3r'\int_0^R 2\pi r^2\,dr\int_{-1}^{1} d(\cos\theta)\,(r^2 + r'^2 - 2rr'\cos\theta)^{-1/2} \\
&= \int_V d^3r'\int_0^R 2\pi\,\frac{r}{r'}\,dr\,(r + r' - |r - r'|) \\
&= 2R^5(4\pi)^2/15
\end{aligned}
\tag{8.3.43}
$$

so that

$$ \mathrm{var}\{x(V)\}_s \sim \langle x(V)\rangle_s\left(1 + \frac{2K_2R^2}{5D}\right) \qquad (V = \tfrac{4}{3}\pi R^3 \ll l_c^3). \tag{8.3.44} $$


Hence, we see in a small volume that the fluctuations are Poissonian, but in a large
volume the fluctuations are non-Poissonian, and in fact the variance is exactly
(7.6.72) given by the global Master equation for the same reaction.

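In fact, for an arbitrary spherical volume, the variance can be evaluated directly
(by use of Fourier transform methods) and is given by


$$
\mathrm{var}\{x(V)\} = \langle x(V)\rangle\left\{1 + \frac{K_2}{K_1 - K_2}\left[1 - \frac{3}{2}\frac{l_c}{R} + \frac{3}{2}\left(\frac{l_c}{R}\right)^3 - \frac{3}{2}\frac{l_c}{R}\left(1 + \frac{l_c}{R}\right)^2 e^{-2R/l_c}\right]\right\}
\tag{8.3.45}
$$


and this gives the more precise large volume asymptotic form

$$ \mathrm{var}\{x(V)\} \sim \langle x(V)\rangle\left(\frac{K_1}{K_1 - K_2} - \frac{3K_2\,l_c}{2(K_1 - K_2)R}\right). \tag{8.3.46} $$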
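
The two limits can be checked numerically against a closed form for the sphere. The sketch below assumes the bracket $1 - \tfrac{3}{2}u + \tfrac{3}{2}u^3 - \tfrac{3}{2}u(1+u)^2 e^{-2/u}$ with $u = l_c/R$, which is what integrating the exponential kernel of (8.3.39) over a sphere gives; the function names and the sample parameter values are hypothetical.

```python
import math

def excess_factor(R, lc):
    """Bracket multiplying K2/(K1-K2) for a sphere of radius R; tends to 1 for R >> lc."""
    u = lc / R
    return 1.0 - 1.5 * u + 1.5 * u ** 3 - 1.5 * u * (1.0 + u) ** 2 * math.exp(-2.0 / u)

def variance_ratio(R, D, K1, K2):
    """var{x(V)} / <x(V)> for a sphere of radius R (requires K1 > K2)."""
    lc = math.sqrt(D / (K1 - K2))
    return 1.0 + K2 / (K1 - K2) * excess_factor(R, lc)

D, K1, K2 = 1.0, 2.0, 1.0          # arbitrary sample values, so lc = 1
for R in (0.01, 0.1, 1.0, 10.0, 100.0):
    small = 1.0 + 2.0 * K2 * R ** 2 / (5.0 * D)              # small-volume estimate
    large = (K1 - 1.5 * K2 / R) / (K1 - K2)                  # large-volume estimate (lc = 1)
    print(f"R={R:7.2f}  exact={variance_ratio(R, D, K1, K2):.6f}  "
          f"small-R={small:.6f}  large-R={large:.6f}")
```

For $R \ll l_c$ the ratio approaches 1 (Poissonian), and for $R \gg l_c$ it approaches $K_1/(K_1 - K_2)$, with a surface correction of relative order $l_c/R$.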


The result can also be illustrated by noting that the Fourier representation of
$\langle\rho(\mathbf{r},t),\rho(\mathbf{r}',t)\rangle_s$ is given by adding those of $g_s(\mathbf{r})$ and $\delta(\mathbf{r})\langle\rho\rangle_s$ and is clearly


$$ \langle\rho\rangle_s(2\pi)^{-3}\,\frac{Dq^2 + K_1}{Dq^2 + K_1 - K_2}. \tag{8.3.47} $$

This means that for small $q$, i.e., long wavelength, this is approximately

$$ \langle\rho\rangle_s(2\pi)^{-3}\,\frac{K_1}{K_1 - K_2} \tag{8.3.48} $$

which is the Fourier transform of

$$ \langle\rho\rangle_s\,\frac{K_1}{K_1 - K_2}\,\delta(\mathbf{r}-\mathbf{r}') \tag{8.3.49} $$

which corresponds to the same variance as the global Master equation.

For large $q$, i.e., small wavelengths, we find

$$ \langle\rho\rangle_s(2\pi)^{-3} \tag{8.3.50} $$

corresponding to

$$ \langle\rho\rangle_s\,\delta(\mathbf{r}-\mathbf{r}'), \tag{8.3.51} $$


i.e., Poissonian fluctuations. Physically, the difference arises from the different
scaling of the diffusion and reaction parts of the master equation which was noted
in Sect. 8.2.3. Thus, in a small volume the diffusion dominates, since the fluctuations
arising from diffusion come about because the molecules are jumping back and
forth across the boundary of $V$. This is a surface area effect which becomes relatively
larger than the chemical reaction fluctuations, which arise from the bulk
reaction, and are proportional to volume. Conversely, for larger $V$, we find the
surface effect is negligible and only the bulk effect is important.


b) Space-Time Correlations 
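Since the system is linear, we can, as in Sect. 8.3.1, use the method of Sect. 3.7.4.
Define

$$ G(\mathbf{r},t) = \langle\rho(\mathbf{r},t),\rho(0,0)\rangle_s . \tag{8.3.52} $$

Then the linear equation corresponding to (3.7.63) is

$$ \partial_t G(\mathbf{r},t) = D\nabla^2 G(\mathbf{r},t) - (K_1 - K_2)\,G(\mathbf{r},t) \tag{8.3.53} $$


with an initial condition given by


$$ G(\mathbf{r},0) = \langle\rho(\mathbf{r}),\rho(0)\rangle_s \tag{8.3.54} $$

which is given by (8.3.35). Representing $G(\mathbf{r},t)$ as a Fourier integral

$$ G(\mathbf{r},t) = \int d^3q\; e^{-i\mathbf{q}\cdot\mathbf{r}}\,\tilde G(\mathbf{q},t), \tag{8.3.55} $$

(8.3.53) becomes

$$ \partial_t\tilde G(\mathbf{q},t) = -(Dq^2 + K_1 - K_2)\,\tilde G(\mathbf{q},t) \tag{8.3.56} $$


whose solution is, utilising the Fourier representation (8.3.47) of the initial
condition (8.3.54),


$$ \tilde G(\mathbf{q},t) = (2\pi)^{-3}\langle\rho\rangle_s\,\frac{Dq^2 + K_1}{Dq^2 + K_1 - K_2}\,\exp[-(Dq^2 + K_1 - K_2)t]. \tag{8.3.57} $$

If desired, this can be inverted by quite standard means though, in fact, the
Fourier transform correlation function is often what is desired in practical cases
and being usually easier to compute, it is favoured.
Thus, we have


$$
\begin{aligned}
G(\mathbf{r},t) = {}& \langle\rho\rangle_s\,\frac{\exp(-r^2/4Dt - Dt/l_c^2)}{(4\pi Dt)^{3/2}} \\
& + \frac{K_2\langle\rho\rangle_s}{8\pi D r}\left\{ e^{-r/l_c}\,\mathrm{erfc}\left[\frac{(Dt)^{1/2}}{l_c} - \frac{r}{2(Dt)^{1/2}}\right] - e^{r/l_c}\,\mathrm{erfc}\left[\frac{(Dt)^{1/2}}{l_c} + \frac{r}{2(Dt)^{1/2}}\right]\right\}.
\end{aligned}
\tag{8.3.58}
$$


For small $t$ we find


$$
G(\mathbf{r},t) \to \langle\rho\rangle_s\exp\left(-\frac{r^2}{4Dt}\right)\left[(4\pi Dt)^{-3/2} - \frac{4K_2(Dt)^{1/2}}{(4\pi)^{3/2}Dr^2}\right] + \frac{K_2\langle\rho\rangle_s}{4\pi D r}\,e^{-r/l_c}, \tag{8.3.59}
$$

while for large $t$,

$$
G(\mathbf{r},t) \to \langle\rho\rangle_s\,\frac{\exp(-r^2/4Dt - Dt/l_c^2)}{(4\pi Dt)^{3/2}}\left(1 + \frac{K_2 l_c^2}{D}\right). \tag{8.3.60}
$$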

c) Behaviour at the Instability Point 


As $K_1 \to K_2$, the reaction approaches instability, and when $K_1 = K_2$, there are no
longer any stationary solutions. We see that simultaneously,


i) the correlation length $l_c \to \infty$ (8.3.36);

ii) the variance of the fluctuations in a volume $V \gg l_c^3$ will become infinite [(8.3.42)].
However, as $l_c \to \infty$ at finite $V$, one reaches a point at which $l_c^3 \sim V$ and (8.3.42)
is no longer valid. Eventually $l_c^3 \gg V$; the volume now appears small and the
fluctuations within it take the Poissonian form;

iii) the correlation time is best taken to be


$$ \tau_c = l_c^2/D \tag{8.3.61} $$

being the inverse of the coefficient of $t$ in the exponent of the long-time behaviour of the space-time
correlation function (8.3.60) (ignoring the diffusive part). Thus, $\tau_c \to \infty$ also,
near the instability. We thus have a picture of long-range correlated slow fluctuations.
This behaviour is characteristic of many similar situations.
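
A short numerical sketch makes the simultaneous divergence concrete. It assumes the readings $l_c = \sqrt{D/(K_1 - K_2)}$ and $\tau_c = l_c^2/D$ together with the large-volume variance factor $K_1/(K_1 - K_2)$; the parameter values are arbitrary and the function names are hypothetical.

```python
import math

def correlation_length(D, K1, K2):
    """l_c = sqrt(D / (K1 - K2)); only defined on the stable side K1 > K2."""
    return math.sqrt(D / (K1 - K2))

def correlation_time(D, K1, K2):
    """tau_c = l_c**2 / D, which equals 1 / (K1 - K2)."""
    return correlation_length(D, K1, K2) ** 2 / D

def large_volume_variance_factor(K1, K2):
    """var / mean for V much larger than l_c**3."""
    return K1 / (K1 - K2)

D, K1 = 1.0, 1.0
for gap in (0.5, 0.1, 0.01, 0.001):
    K2 = K1 - gap
    print(f"K1-K2={gap:6.3f}  l_c={correlation_length(D, K1, K2):8.3f}  "
          f"tau_c={correlation_time(D, K1, K2):9.3f}  "
          f"var/mean={large_volume_variance_factor(K1, K2):9.3f}")
```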


8.3.3 A Nonlinear Model with a Second-Order Phase Transition 


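We consider the reactions


$$ A + X \underset{k_4}{\overset{k_2}{\rightleftharpoons}} 2X, \qquad B + X \underset{k_3}{\overset{k_1}{\rightleftharpoons}} C \tag{8.3.62} $$


considered previously in Sects. 7.7c, 7.7.3a, 7.7.4c. This model was first introduced
by Schlögl [8.4] and has since been treated by many others.
The deterministic equation is


$$ \partial_t\rho(\mathbf{r},t) = D\nabla^2\rho + \kappa_3 + (\kappa_2 - \kappa_1)\rho - \kappa_4\rho^2 \tag{8.3.63} $$

whose stationary solution is given by

$$ \rho(\mathbf{r}) = \rho_s = \left[\kappa_2 - \kappa_1 + \sqrt{(\kappa_2 - \kappa_1)^2 + 4\kappa_4\kappa_3}\,\right]/2\kappa_4 . \tag{8.3.64} $$

[The $\kappa$'s are as defined by (7.5.29), with $\Omega = l^3$]. For small $\kappa_3$, this stationary
solution represents a transition behaviour as illustrated in Fig. 8.1.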
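
The transition behaviour can be seen by evaluating the stationary solution for a small $\kappa_3$ on either side of $\kappa_2 = \kappa_1$. The sketch below assumes the rate equation $\partial_t\rho = \kappa_3 + (\kappa_2 - \kappa_1)\rho - \kappa_4\rho^2$ for the homogeneous state; parameter values and function names are illustrative only.

```python
import math

def rho_s(k1, k2, k3, k4):
    """Positive root of k3 + (k2 - k1)*rho - k4*rho**2 = 0, as in (8.3.64)."""
    return (k2 - k1 + math.sqrt((k2 - k1) ** 2 + 4.0 * k4 * k3)) / (2.0 * k4)

def drift(rho, k1, k2, k3, k4):
    """Right-hand side of the homogeneous deterministic equation."""
    return k3 + (k2 - k1) * rho - k4 * rho ** 2

k1, k3, k4 = 1.0, 1e-3, 1.0        # arbitrary values with small k3
for k2 in (0.0, 0.5, 0.9, 1.0, 1.1, 1.5, 2.0):
    r = rho_s(k1, k2, k3, k4)
    print(f"kappa2={k2:4.1f}  rho_s={r:.5f}  residual={drift(r, k1, k2, k3, k4):.1e}")
```

Below $\kappa_2 = \kappa_1$ the stationary density stays of order $\kappa_3/(\kappa_1 - \kappa_2)$, while above it the density grows roughly like $(\kappa_2 - \kappa_1)/\kappa_4$.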


The appropriate stochastic differential equation in the Poisson representation
is

$$
\begin{aligned}
d\eta(\mathbf{r},t) = {}& [(\mathcal{D}\eta)(\mathbf{r},t) + \kappa_3 + (\kappa_2 - \kappa_1)\eta(\mathbf{r},t) - \kappa_4\eta^2(\mathbf{r},t)]\,dt \\
& + \sqrt{2}\,[\kappa_2\eta(\mathbf{r},t) - \kappa_4\eta^2(\mathbf{r},t)]^{1/2}\,dW(\mathbf{r},t),
\end{aligned}
\tag{8.3.65}
$$

where

$$ dW(\mathbf{r},t)\,dW(\mathbf{r}',t) = \delta(\mathbf{r}-\mathbf{r}')\,dt . \tag{8.3.66} $$

Fig. 8.1. Plot of $\langle x\rangle$ vs $\kappa_2$ for the second-order phase transition model


Here we have not taken the continuum limit developed in Sect. 8.2.1 but have preserved
a possible nonlocality of diffusion, i.e.,


$$ (\mathcal{D}\eta)(\mathbf{r},t) = -\int \mathcal{D}(|\mathbf{r}-\mathbf{r}'|)\,\eta(\mathbf{r}',t)\,d^3r' \tag{8.3.67} $$


but we have considered only homogeneous isotropic diffusion.

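In this form, there is no small parameter multiplying the noise term $dW(\mathbf{r},t)$
and we are left without any obvious expansion parameter. We formally introduce a
parameter $\lambda$ in (8.3.65) as


$$
\begin{aligned}
d\eta(\mathbf{r},t) = {}& [(\mathcal{D}\eta)(\mathbf{r},t) + \kappa_3 + (\kappa_2 - \kappa_1)\eta(\mathbf{r},t) - \kappa_4\eta^2(\mathbf{r},t)]\,dt \\
& + \lambda\sqrt{2}\,[\kappa_2\eta(\mathbf{r},t) - \kappa_4\eta^2(\mathbf{r},t)]^{1/2}\,dW(\mathbf{r},t)
\end{aligned}
\tag{8.3.68}
$$


and expand $\eta(\mathbf{r},t)$ in powers of $\lambda$ to

$$ \eta(\mathbf{r},t) = \eta_0(\mathbf{r},t) + \lambda\eta_1(\mathbf{r},t) + \lambda^2\eta_2(\mathbf{r},t) + \ldots \tag{8.3.69} $$


and set $\lambda$ equal to one at the end of the calculation. However, if it is understood
that all Fourier variable integrals have a cutoff $l^{-1}$, this will still be in fact a $(l^3)^{-1}$
expansion.

Substituting (8.3.69) in (8.3.65), we get


$$ \frac{d\eta_0(\mathbf{r},t)}{dt} = (\mathcal{D}\eta_0)(\mathbf{r},t) + \kappa_3 + (\kappa_2 - \kappa_1)\eta_0(\mathbf{r},t) - \kappa_4\eta_0^2(\mathbf{r},t) \tag{8.3.70} $$


$$
\begin{aligned}
d\eta_1(\mathbf{r},t) = {}& \{(\mathcal{D}\eta_1)(\mathbf{r},t) + [\kappa_2 - \kappa_1 - 2\kappa_4\eta_0(\mathbf{r},t)]\,\eta_1(\mathbf{r},t)\}\,dt \\
& + \sqrt{2}\,[\kappa_2\eta_0(\mathbf{r},t) - \kappa_4\eta_0^2(\mathbf{r},t)]^{1/2}\,dW(\mathbf{r},t)
\end{aligned}
\tag{8.3.71}
$$


$$
\begin{aligned}
d\eta_2(\mathbf{r},t) = {}& \{(\mathcal{D}\eta_2)(\mathbf{r},t) + [\kappa_2 - \kappa_1 - 2\kappa_4\eta_0(\mathbf{r},t)]\,\eta_2(\mathbf{r},t) - \kappa_4\eta_1^2(\mathbf{r},t)\}\,dt \\
& + \frac{[\kappa_2 - 2\kappa_4\eta_0(\mathbf{r},t)]\,\eta_1(\mathbf{r},t)}{\sqrt{2}\,[\kappa_2\eta_0(\mathbf{r},t) - \kappa_4\eta_0^2(\mathbf{r},t)]^{1/2}}\,dW(\mathbf{r},t).
\end{aligned}
\tag{8.3.72}
$$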


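Equation (8.3.70) has a homogeneous steady-state solution (8.3.64) so that

$$ \kappa_2 - \kappa_1 - 2\kappa_4\eta_0 = -\kappa = -[(\kappa_2 - \kappa_1)^2 + 4\kappa_4\kappa_3]^{1/2} . \tag{8.3.73} $$


Substituting this in (8.3.70-72), noting that in the steady state
$\kappa_2\eta_0 - \kappa_4\eta_0^2 = \kappa_1\eta_0 - \kappa_3 \approx \kappa_1\eta_0$ for small $\kappa_3$, and taking the Fourier transforms, we get


$$ \tilde\eta_1(\mathbf{q},t) = (2\kappa_1\eta_0)^{1/2}\int_0^t \exp\{-[\tilde{\mathcal{D}}(q^2) + \kappa](t - t')\}\,d\tilde W(\mathbf{q},t') \tag{8.3.74} $$

$$
\begin{aligned}
\tilde\eta_2(\mathbf{q},t) = {}& -\kappa_4\int d^3q_1\int_0^t dt'\,\exp\{-[\tilde{\mathcal{D}}(q^2) + \kappa](t - t')\}\,\tilde\eta_1(\mathbf{q}-\mathbf{q}_1,t')\,\tilde\eta_1(\mathbf{q}_1,t') \\
& + \frac{\kappa_2 - 2\kappa_4\eta_0}{(2\kappa_1\eta_0)^{1/2}}\int d^3q_1\int_0^t \exp\{-[\tilde{\mathcal{D}}(q^2) + \kappa](t - t')\}\,\tilde\eta_1(\mathbf{q}-\mathbf{q}_1,t')\,d\tilde W(\mathbf{q}_1,t')
\end{aligned}
\tag{8.3.75}
$$


where


$$ d\tilde W(\mathbf{q},t) = \left(\frac{1}{2\pi}\right)^{3/2}\int \exp(-i\mathbf{q}\cdot\mathbf{r})\,dW(\mathbf{r},t)\,d^3r \tag{8.3.76} $$

and

$$ \tilde\eta_1(\mathbf{q},t) = \left(\frac{1}{2\pi}\right)^{3/2}\int \exp(-i\mathbf{q}\cdot\mathbf{r})\,\eta_1(\mathbf{r},t)\,d^3r, \tag{8.3.77} $$


etc., and $\tilde{\mathcal{D}}(q^2)$ is the Fourier transform of $\mathcal{D}(|\mathbf{r}-\mathbf{r}'|)$. We have left out the trivial
initial value terms in (8.3.74, 75).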

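To the lowest order, the mean concentration and the correlation function are
given by (in the stationary state)


$$ \langle\rho(\mathbf{r},t)\rangle = \eta_0 \tag{8.3.78} $$

$$ \langle\rho(\mathbf{r},t)\rho(\mathbf{r}',t)\rangle - \langle\rho(\mathbf{r},t)\rangle\langle\rho(\mathbf{r}',t)\rangle = \eta_0\,\delta(\mathbf{r}-\mathbf{r}') + \langle\eta_1(\mathbf{r},t)\,\eta_1(\mathbf{r}',t)\rangle . \tag{8.3.79} $$

From (8.3.74) it follows that

$$ \langle\tilde\eta_1(\mathbf{q},t)\,\tilde\eta_1(\mathbf{q}',t)\rangle = \frac{\kappa_1\eta_0\,\delta(\mathbf{q}+\mathbf{q}')}{\tilde{\mathcal{D}}(q^2) + \kappa}\,[1 - \exp\{-2[\tilde{\mathcal{D}}(q^2) + \kappa]t\}]. \tag{8.3.80} $$


Hence, the lowest-order contributions to the correlation function in the steady
state are given by


$$ \langle\rho(\mathbf{r})\rho(\mathbf{r}')\rangle - \langle\rho(\mathbf{r})\rangle\langle\rho(\mathbf{r}')\rangle = \eta_0\,\delta(\mathbf{r}-\mathbf{r}') + \frac{\kappa_1\eta_0}{(2\pi)^3}\int d^3q\,\frac{\exp[i\mathbf{q}\cdot(\mathbf{r}-\mathbf{r}')]}{\tilde{\mathcal{D}}(q^2) + \kappa}. \tag{8.3.81} $$


If we assume that


$$ (\mathcal{D}\eta)(\mathbf{r}) = D\nabla^2\eta(\mathbf{r}), \tag{8.3.82} $$

then

$$ \tilde{\mathcal{D}}(q^2) = Dq^2 . \tag{8.3.83} $$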


Equation (8.3.81) then has exactly the same form as derived previously in
(8.3.35), namely,


$$ \langle\rho(\mathbf{r}),\rho(\mathbf{r}')\rangle_s = \langle\rho\rangle_s\,\delta(\mathbf{r}-\mathbf{r}') + \frac{\kappa_1\langle\rho\rangle_s}{4\pi D|\mathbf{r}-\mathbf{r}'|}\exp(-|\mathbf{r}-\mathbf{r}'|/l_c) \tag{8.3.84} $$

where

$$ l_c = (D/\kappa)^{1/2} . \tag{8.3.85} $$


All the same conclusions will follow concerning local and global fluctuations,
Poissonian and non-Poissonian fluctuations, since these are all consequences of the
form (8.3.84) and not of the particular values of the parameters chosen.

Notice that if $\kappa_1 = \kappa_2$, as $\kappa_3 \to 0$ so does $\kappa \to 0$, and hence $l_c \to \infty$, and the
same long-range correlation phenomena occur here as in that example.


Higher-Order Corrections: Divergence Problems. These are only lowest-order results, 
which can be obtained equally well by a system size expansion, Kramers-Moyal 
method, or by factorisation assumptions on correlation function equations. 

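The next order correction to the mean concentration is $\langle\eta_2(\mathbf{r},t)\rangle$. Now from
(8.3.75)


$$ \langle\tilde\eta_2(\mathbf{q},t)\rangle = -\kappa_4\int d^3q_1\int_0^t dt'\,\exp\{-[\tilde{\mathcal{D}}(q^2) + \kappa](t - t')\}\,\langle\tilde\eta_1(\mathbf{q}-\mathbf{q}_1,t')\,\tilde\eta_1(\mathbf{q}_1,t')\rangle . \tag{8.3.86} $$


In the steady state (8.3.86) gives


$$ \langle\tilde\eta_2(\mathbf{q})\rangle = -\frac{\kappa_4\kappa_1\eta_0\,\delta(\mathbf{q})}{\kappa}\int\frac{d^3q_1}{\tilde{\mathcal{D}}(q_1^2) + \kappa}. \tag{8.3.87} $$


For the choice of $\tilde{\mathcal{D}}$ given by (8.3.83), (8.3.87) gives


$$ \langle\eta_2\rangle = -\frac{\kappa_4\kappa_1\eta_0}{\kappa\,(2\pi)^{3/2}}\int\frac{d^3q_1}{Dq_1^2 + \kappa} \tag{8.3.88} $$


and this integral is divergent. The problem lies in the continuum form chosen, for
if one preserves the discrete cells at all stages, one has two modifications:

i) $\int d^3q$ is a sum over all allowed $\mathbf{q}$ which have a maximum value $|\mathbf{q}| \sim 1/l$.

ii) $\tilde{\mathcal{D}}(q^2)$ is some trigonometric function of $\mathbf{q}$ and $l$. If


$$
\begin{aligned}
d_{ij} &= 0 \quad (i \text{ not adjacent to } j) \\
&= d \quad \text{otherwise},
\end{aligned}
$$


then $\tilde{\mathcal{D}}(q^2)$ has the form


$$ \frac{4D}{l^2}\left[\sin^2\left(\frac{lq_x}{2}\right) + \sin^2\left(\frac{lq_y}{2}\right) + \sin^2\left(\frac{lq_z}{2}\right)\right]. $$


In this form, no divergence arises since the sum is finite and for

$$ |\mathbf{q}|\,l \ll 1, \qquad \tilde{\mathcal{D}}(q^2) \to Dq^2 . $$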
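
Both modifications can be illustrated numerically. The sketch below assumes the nearest-neighbour lattice form $(4D/l^2)\sum_i \sin^2(lq_i/2)$ with $D = l^2 d$, and it models the discrete sum crudely by a continuum integral with a sharp cutoff $|\mathbf{q}| < \Lambda$; the function names and sample values are hypothetical.

```python
import math

def lattice_dispersion(q, D, l):
    """(4 D / l**2) * sum of sin^2(l*q_i/2) over the three components of q."""
    return 4.0 * D / l ** 2 * sum(math.sin(0.5 * l * qi) ** 2 for qi in q)

def continuum_dispersion(q, D):
    """Continuum limit D * |q|**2."""
    return D * sum(qi * qi for qi in q)

def cutoff_integral(cutoff, D, kappa):
    """Integral of d^3q / (D q^2 + kappa) over the ball |q| < cutoff."""
    a = math.sqrt(kappa / D)
    return 4.0 * math.pi / D * (cutoff - a * math.atan(cutoff / a))

q = (0.01, 0.02, 0.03)
print("lattice  :", lattice_dispersion(q, D=1.0, l=1.0))
print("continuum:", continuum_dispersion(q, D=1.0))

# The integral in (8.3.88) grows linearly with the cutoff, so it is finite
# only while a largest wavenumber of order 1/l is kept.
for cutoff in (10.0, 100.0, 1000.0):
    print(f"cutoff={cutoff:7.1f}  integral={cutoff_integral(cutoff, 1.0, 1.0):.3f}")
```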




Nonlocal Reactions. The divergence, however, can also be avoided by preserving the
continuum but modifying the $\eta^2(\mathbf{r})$ term in the equations. This term represents a
collision picture of chemical reactions and will be nonlocal for a variety of reasons.

i) Molecules have a finite size, so a strictly local form ignores this.

ii) The time scale on which the system is Markovian is such that molecules will have
travelled a certain distance in this time; the products would be the result of
encounters at a variety of previous positions, and would be produced at yet
another position. By making the replacement


$$ \eta(\mathbf{r})^2 \to \int d^3r'\,d^3r''\,m(\mathbf{r}-\mathbf{r}',\mathbf{r}-\mathbf{r}'')\,\eta(\mathbf{r}')\,\eta(\mathbf{r}''), \tag{8.3.89} $$


one finds that instead of (8.3.88) one has


$$ \langle\eta_2\rangle = -\frac{\kappa_4\kappa_1\eta_0}{\kappa\,(2\pi)^{3/2}}\int d^3q_1\,\frac{\tilde m(\mathbf{q}_1,-\mathbf{q}_1)}{Dq_1^2 + \kappa} \tag{8.3.90} $$


where $\tilde m(\mathbf{q},\mathbf{q}')$ is the Fourier transform of $m(\mathbf{r},\mathbf{r}')$. If $m$ is sufficiently nonlocal, at
high $q$, $\tilde m(\mathbf{q},-\mathbf{q})$ will decay and $\langle\eta_2\rangle$ will be finite.


8.4 Connection Between Local and Global Descriptions 


We saw in Sect. 8.3.2 that the variance of the fluctuations in a volume element $V$ is
Poissonian for small $V$ and for sufficiently large $V$, approaches that corresponding
to the master equation without diffusion for the corresponding reaction. This arises
because the reactions give rise to fluctuations which add to each other whereas
diffusion, as a mere transfer of molecules from one place to another, has an effect
on the fluctuations in a volume which is effective only on the surface of the volume.

There is a precise theorem which expresses the fact that if diffusion is fast 
enough, the molecules in a volume V will travel around rapidly and meet each 
other very frequently. Hence, any molecule will be equally likely to meet any other 
molecule. The results are summarised by Arnold [8.5] and have been proved quite 
rigorously. 

Basically, the method of proof depends on adiabatic elimination techniques as 
developed in Chap. 5 and can be easily demonstrated by Poisson representation 
methods. 


8.4.1 Explicit Adiabatic Elimination of Inhomogeneous Modes 
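We suppose we are dealing with a single chemical species which diffuses and reacts
according to the cell model and thus has a Poisson representation Fokker-Planck
equation:


$$ \frac{\partial P}{\partial t} = \left\{-\sum_{i,j}\frac{\partial}{\partial x_i}D_{ij}x_j + \sum_i\left[-\frac{\partial}{\partial x_i}a(x_i) + \frac{1}{2}\frac{\partial^2}{\partial x_i^2}b(x_i)\right]\right\}P . \tag{8.4.1} $$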


We introduce the eigenvectors $f_i(q)$ of $D_{ij}$, which satisfy

$$ \sum_j D_{ij}f_j(q) = -D\lambda(q)f_i(q) . \tag{8.4.2} $$


The precise nature of the eigenfunctions is not very important and will depend on
$D_{ij}$. For the simplest form, with transfer only between adjacent cells and with
reflecting boundaries at the walls of the total system, assumed to be one dimensional
and of length $L = Nl$ (with $l$ the cell length), one has


$$ f_i(q) \propto \cos(qi) \tag{8.4.3} $$


and the reflecting boundary condition requires that $q$ has the form


$$ q = \frac{n\pi}{N} \tag{8.4.4} $$


and

$$ \lambda(q) = \frac{4}{l^2}\sin^2\left(\frac{q}{2}\right). \tag{8.4.5} $$


Appropriate modifications must be made to take care of more dimensions, but the
basic structure is the same, namely,


$$ \lambda(0) = 0 \tag{8.4.6} $$

$$ \lambda(q) > 0 \quad (q \neq 0) \quad\text{and}\quad f_i(0) = \text{constant} = N^{-1/2} . \tag{8.4.7} $$


This last result represents the homogeneous stationary state of diffusion with the
normalisation $N^{-1/2}$ fixed by the completeness and orthogonality relations


$$
\begin{aligned}
\sum_i f_i(q)f_i(q') &= \delta_{q,q'} \\
\sum_q f_i(q)f_j(q) &= \delta_{i,j} .
\end{aligned}
\tag{8.4.8}
$$
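
For the one-dimensional nearest-neighbour case these properties can be verified directly. The sketch below is a check under stated assumptions rather than the text's own construction: cells are indexed from 0, the cosine modes are evaluated at cell centres, i.e. $\cos[q(i + \tfrac12)]$ with $q = n\pi/N$, and the matrix uses a hopping rate $d$ per neighbour, so the eigenvalue $-D\lambda(q)$ becomes $-4d\sin^2(q/2)$. All function names are hypothetical.

```python
import math

def hopping_matrix(n_cells, d):
    """Nearest-neighbour transfer matrix D_ij with reflecting ends (rate d per neighbour)."""
    mat = [[0.0] * n_cells for _ in range(n_cells)]
    for i in range(n_cells):
        for j in (i - 1, i + 1):
            if 0 <= j < n_cells:
                mat[i][j] += d      # gain from neighbour j
                mat[i][i] -= d      # loss to neighbour j
    return mat

def cosine_mode(n_cells, n):
    """Normalised cosine mode with q = n*pi/N, sampled at cell centres."""
    q = n * math.pi / n_cells
    vec = [math.cos(q * (i + 0.5)) for i in range(n_cells)]
    norm = math.sqrt(sum(v * v for v in vec))
    return q, [v / norm for v in vec]

def eigen_residual(n_cells, d, n):
    """Largest component of (D f - eigenvalue * f) for mode n."""
    mat = hopping_matrix(n_cells, d)
    q, f = cosine_mode(n_cells, n)
    eig = -4.0 * d * math.sin(0.5 * q) ** 2
    image = [sum(mat[i][j] * f[j] for j in range(n_cells)) for i in range(n_cells)]
    return max(abs(image[i] - eig * f[i]) for i in range(n_cells))

for n in range(4):
    print("mode", n, "residual", eigen_residual(8, 1.0, n))
```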
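We now introduce the variables

$$ x(q) = \sum_i f_i(q)\,x_i . \tag{8.4.9} $$

The variable of interest is proportional to $x(0)$, since

$$ x(0) = N^{-1/2}\sum_i x_i \tag{8.4.10} $$


is proportional to the total amount of substance in the system. The other variables
are to be adiabatically eliminated. In anticipation of an appropriate choice of
operators $L_1$, $L_2$, $L_3$, we define

$$ y(q) = x(q)\sqrt{D} \qquad (q \neq 0) \tag{8.4.11} $$


$$ x = x(0)/\sqrt{N} = N^{-1}\sum_i x_i . \tag{8.4.12} $$


The various terms in the FPE can now be written


$$ -\sum_{i,j}\frac{\partial}{\partial x_i}D_{ij}x_j = D\sum_{q\neq 0}\lambda(q)\frac{\partial}{\partial y(q)}\,y(q). \tag{8.4.13} $$

Define now

$$ v_i = \sum_{q\neq 0} f_i(q)\,y(q) \tag{8.4.14} $$

then


$$
\sum_i\frac{\partial}{\partial x_i}a(x_i) = \frac{1}{N}\frac{\partial}{\partial x}\sum_i a\Big(x + \frac{v_i}{\sqrt{D}}\Big) + \sqrt{D}\sum_{q\neq 0}\frac{\partial}{\partial y(q)}\sum_i f_i(q)\,a\Big(x + \frac{v_i}{\sqrt{D}}\Big), \tag{8.4.15}
$$


$$
\begin{aligned}
\sum_i\frac{\partial^2}{\partial x_i^2}b(x_i) = {}& \frac{1}{N^2}\frac{\partial^2}{\partial x^2}\sum_i b\Big(x + \frac{v_i}{\sqrt{D}}\Big)
+ \frac{2\sqrt{D}}{N}\sum_{q\neq 0}\frac{\partial^2}{\partial x\,\partial y(q)}\sum_i f_i(q)\,b\Big(x + \frac{v_i}{\sqrt{D}}\Big) \\
& + D\sum_{q,q'\neq 0}\frac{\partial^2}{\partial y(q)\,\partial y(q')}\sum_i f_i(q)f_i(q')\,b\Big(x + \frac{v_i}{\sqrt{D}}\Big).
\end{aligned}
\tag{8.4.16}
$$

We now write this in decreasing powers of $\sqrt{D}$. Thus, define


$$ DL_1 = D\sum_{q\neq 0}\lambda(q)\frac{\partial}{\partial y(q)}\,y(q) + \frac{D}{2}\sum_{q\neq 0}\frac{\partial^2}{\partial y(q)^2}\,b(x) \tag{8.4.17} $$


which is the coefficient of $D$ in an expansion of the terms above (lower-order terms
are absorbed in $L_2$). We also set


$$ L_3 = \left\langle -\frac{1}{N}\frac{\partial}{\partial x}\sum_i a\Big(x + \frac{v_i}{\sqrt{D}}\Big) + \frac{1}{2N^2}\frac{\partial^2}{\partial x^2}\sum_i b\Big(x + \frac{v_i}{\sqrt{D}}\Big)\right\rangle \tag{8.4.18} $$


where the average is over the stationary distribution of $L_1$. As $D$ becomes large,


$$ L_3 \to -\frac{\partial}{\partial x}a(x) + \frac{1}{2N}\frac{\partial^2}{\partial x^2}b(x) + O\left(\frac{1}{D}\right). \tag{8.4.19} $$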


We define $L_2$ to order $\sqrt{D}$ which can be computed to involve only terms in which
$\partial/\partial y(q)$ stands to the left. Thus, in carrying out an adiabatic elimination of the
$y(q)$ variables as $D \to \infty$ to lowest order, there will be no $PL_2L_1^{-1}L_2$ contribution
and we ultimately find the FPE


$$ \frac{\partial P}{\partial t} = \left[-\frac{\partial}{\partial x}a(x) + \frac{1}{2N}\frac{\partial^2}{\partial x^2}b(x)\right]P \tag{8.4.20} $$


which corresponds to the global Master equation, since $x$ is by (8.4.12) a concentration.
Notice that the condition for validity is that

$$ D\lambda(1) \gg K \tag{8.4.21} $$


where $K$ represents a typical rate of the chemical reaction, or noticing that
$\lambda(1) \approx (\pi/Nl)^2$,


$$ D \gg \frac{N^2l^2K}{\pi^2} \tag{8.4.22} $$

or

$$ \sqrt{\frac{D}{K}} \gg \frac{Nl}{\pi}. \tag{8.4.23} $$


The left-hand side, by definition of diffusion, represents the root mean square
distance travelled in the time scale of the chemical reaction, and the inequality
says this must be very much bigger than the length of the system. Thus, diffusion
must be able to homogenise the system. In [8.9] this result has been extended to
include homogenisation on an arbitrary scale.
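
The homogenisation criterion is easy to evaluate for a given system. The sketch below assumes a one-dimensional chain of $N$ cells of length $l$ with slowest inhomogeneous decay rate $D\lambda(1) = (4D/l^2)\sin^2(\pi/2N)$; the numbers are arbitrary and the function names are hypothetical.

```python
import math

def slowest_mode_rate(D, n_cells, l):
    """Decay rate D*lambda(1) of the slowest inhomogeneous mode (n = 1)."""
    return 4.0 * D / l ** 2 * math.sin(0.5 * math.pi / n_cells) ** 2

def homogenisation_ratio(D, K, n_cells, l):
    """sqrt(D/K) divided by N*l/pi; the global description needs this to be >> 1."""
    return math.sqrt(D / K) / (n_cells * l / math.pi)

D, K, n_cells, l = 100.0, 1.0, 10, 0.1   # arbitrary sample values
print("D*lambda(1) / K      :", slowest_mode_rate(D, n_cells, l) / K)
print("sqrt(D/K) / (N l/pi) :", homogenisation_ratio(D, K, n_cells, l))
```

The first number is approximately the square of the second, since $\lambda(1) \approx (\pi/Nl)^2$ for large $N$.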


8.5 Phase-Space Master Equation 


The remainder of this chapter deals with a stochastic version of kinetic theory 
and draws on the works of van Kampen [8.6] and van den Broek et al. [8.7]. 

We consider a gas in which molecules are described in terms of their positions 
r and velocities v. The molecules move about and collide with each other. When a 
collision between two molecules occurs, their velocities are altered by this collision 
but their positions are not changed. However, between the collisions, the particles 
move freely with constant velocity. 

Thus, there are two processes—collision and flow. Each of these can be handled 
quite easily in the absence of the other, but combining them is not straightforward. 


8.5.1 Treatment of Flow 


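We suppose there is a large number of particles of mass $m$ with coordinates $\mathbf{r}_n$ and
velocities $\mathbf{v}_n$, which do not collide but move under the influence of an external
force field $m\mathbf{A}(\mathbf{r})$. The equations of motion are


$$
\begin{aligned}
\dot{\mathbf{r}}_n &= \mathbf{v}_n \\
\dot{\mathbf{v}}_n &= \mathbf{A}(\mathbf{r}_n) .
\end{aligned}
\tag{8.5.1}
$$


Then a phase-space density can be defined as

$$ f(\mathbf{r},\mathbf{v},t) = \sum_n \delta(\mathbf{r}-\mathbf{r}_n(t))\,\delta(\mathbf{v}-\mathbf{v}_n(t)) \tag{8.5.2} $$

so that


$$ \partial_t f(\mathbf{r},\mathbf{v},t) = \sum_n\left[\dot{\mathbf{r}}_n\cdot\nabla_n\delta(\mathbf{r}-\mathbf{r}_n(t))\,\delta(\mathbf{v}-\mathbf{v}_n(t)) + \delta(\mathbf{r}-\mathbf{r}_n(t))\,\dot{\mathbf{v}}_n\cdot\nabla_{v_n}\delta(\mathbf{v}-\mathbf{v}_n(t))\right] $$


and using the properties of the delta function and the equations of motion (8.5.1),
we get


$$ [\partial_t + \mathbf{v}\cdot\nabla + \mathbf{A}(\mathbf{r})\cdot\nabla_v]\,f(\mathbf{r},\mathbf{v},t) = 0 . \tag{8.5.3} $$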


This is a deterministic flow equation for a phase-space density. If the particles are
distributed according to some initial probability distribution in position and velocity,
(8.5.3) is unaltered but $f(\mathbf{r},\mathbf{v},t)$ is to be interpreted as the average of $f(\mathbf{r},\mathbf{v},t)$ as defined
in (8.5.2) over the initial positions and velocities.

Equation (8.5.3) is exact. The variable $f(\mathbf{r},\mathbf{v},t)$ can be regarded as a random
variable whose time development is given by this equation, which can be regarded as
a stochastic differential equation with zero noise. The number of particles in a
phase cell, i.e., a six-dimensional volume element of volume $\Delta_i$ centred on the
phase-space coordinate $(\mathbf{r}_i,\mathbf{v}_i)$ is, of course,


$$ X(\mathbf{r}_i,\mathbf{v}_i) = \int_{\Delta_i} d^3r\,d^3v\,f(\mathbf{r},\mathbf{v}). \tag{8.5.4} $$


8.5.2 Flow as a Birth-Death Process 


For the purpose of compatibility with collisions, represented by a birth-death 
Master equation, it would be very helpful to be able to represent flow as a birth- 
death process in the cells 4;. This cannot be done for arbitrary cell size but, in the 
limit of vanishingly small cell size, there is a birth-death representation of any flow 
process. 

To illustrate the point, consider a density p(r, t) in a one-dimensional system 
which obeys the flow equation 


0,p(r, t) = — kd,p(r, t). (8.5.5) 
This deterministic equation is the limit of a discrete equation for x,, defined as 
x(t) = J dr p(r, t) = Ap(r;, t) (8.5.6) 


where 4, is an interval of length J around a cell point at r,. The flow equation is 
then the limit as 4 — 0 of 


Ox(t) = = bat) — xO), (8.5.7) 


1.€., 
10,p(r,, t) = 7 [Ap(r, — 4, t) — Ap(r, t)] (8.5.8) 


whose limit as $\lambda \to 0$ is the flow equation (8.5.5). A stochastic version is found by
considering particles jumping from cell i to cell i + 1 with transition probabilities
per unit time:

$$t_i^+(x) = \kappa x_i/\lambda, \qquad t_i^-(x) = 0. \tag{8.5.9}$$


This is of the form studied in Sect. 7.5, whose notation (7.5.4, 5) takes the form
here of

$$A \to i, \qquad N_a^i = \delta_{a,i}, \qquad M_a^i = \delta_{a,i+1}, \qquad r_a^i = \delta_{a,i+1} - \delta_{a,i}. \tag{8.5.10}$$


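We consider the Kramers-Moyal expansion and show that in the limit $\lambda \to 0$, all
derivate moments except those of first order vanish. The drift coefficient is given
by (7.5.32), i.e.,

$$A_a(x) = \sum_i (\delta_{a,i+1} - \delta_{a,i})\,\kappa x_i/\lambda = \frac{\kappa}{\lambda}\,(x_{a-1} - x_a) \tag{8.5.11}$$

and the diffusion matrix

$$B_{ab}(x) = \frac{\kappa}{\lambda}\,[(\delta_{a,b} - \delta_{a-1,b})\,x_{a-1} + (\delta_{a,b} - \delta_{a,b-1})\,x_a]. \tag{8.5.12}$$

We now set $x_i = \lambda \rho(r_i, t)$ as in (8.5.6) and take a small $\lambda$ limit; one finds

$$A_a(x) \to \frac{\kappa}{\lambda}\,[\lambda \rho(r_a - \lambda, t) - \lambda \rho(r_a, t)] \tag{8.5.13}$$

$$\to -\lambda \kappa\, \partial_r \rho(r, t) \tag{8.5.14}$$

and the limiting value of $B_{ab}(x)$ is found similarly, but also using

$$\delta_{a,b} \to \lambda\, \delta(r_a - r_b) \tag{8.5.15}$$

$$B_{ab}(x) \to \kappa \lambda^3\, \partial_r \partial_{r'} [\delta(r - r')\rho(r)]. \tag{8.5.16}$$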

Thus, in this limit, $\rho(r, t)$ obeys a stochastic differential equation

$$d\rho(r, t) = -\kappa\, \partial_r \rho(r, t)\,dt + \lambda^{1/2}\, dW(r, t) \tag{8.5.17}$$

where

$$dW(r, t)\,dW(r', t) = \kappa\, dt\, \partial_r \partial_{r'} [\delta(r - r')\rho(r, t)]. \tag{8.5.18}$$

We see that in the limit $\lambda \to 0$, the noise term in (8.5.17) vanishes, leading to the
deterministic result as predicted.

It is interesting to ask why this deterministic limit occurs. It is not a system size
expansion in the usual sense of the word, since neither the numbers of particles $x_i$
nor the transition probabilities $t_i^{\pm}(x)$ become large in the small $\lambda$ limit. However,
the transition probability for a single particle at $r_i$ to jump to the next cell does
become infinite in the small $\lambda$ limit, so in this sense, the motion becomes
deterministic.
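
The single-particle argument can be checked numerically. The following is only an illustrative sketch, not part of the original treatment: it assumes independent particles, each hopping one cell of length `lam` to the right at rate `kappa / lam` as in (8.5.9), and the function names are hypothetical. Under that assumption the displacement after time t has mean κt and variance λκt, so the spread about the deterministic drift shrinks as the cell size decreases.

```python
import random


def simulate_hops(kappa, lam, t_end, n_particles, seed=0):
    """Final positions of independent particles hopping right by lam at rate kappa/lam."""
    rng = random.Random(seed)
    rate = kappa / lam          # jump rate of a single particle
    positions = []
    for _ in range(n_particles):
        t, cells = 0.0, 0
        while True:
            t += rng.expovariate(rate)   # exponential waiting time to the next jump
            if t > t_end:
                break
            cells += 1
        positions.append(cells * lam)
    return positions


def mean_and_variance(values):
    """Sample mean and (population) variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var


if __name__ == "__main__":
    for lam in (0.1, 0.01):
        m, v = mean_and_variance(simulate_hops(1.0, lam, 1.0, 2000, seed=1))
        print(f"lambda={lam}: mean={m:.3f}, variance={v:.4f}")
```

With κ = 1 and t = 1 the mean stays near 1 for both cell sizes, while the variance falls roughly in proportion to λ.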

The reader can also check that this result is independent of dimensionality of 
space. Thus, we can find a representation of flow in phase space which, in the 
limit of small cell size, does become equivalent to the flow equation (8.5.3). 

Let us now consider a full phase-space formulation, including both terms of the
flow equation. The cells in phase space are taken to be six-dimensional boxes with
side length $\lambda$ in position coordinates and $\xi$ in velocity coordinates. We define the
phase-space density in terms of the total number of molecules X in a phase cell by

$$f(\mathbf r^i, \mathbf v^i) = X(\mathbf r^i, \mathbf v^i)/\xi^3 \lambda^3. \tag{8.5.19}$$


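We consider transitions of two types. For simplicity, consider these only in the
x-direction and define

$$\boldsymbol\lambda_x = (\lambda, 0, 0). \tag{8.5.20}$$

Type 1: Flow in position space:

$$X(\mathbf r^i, \mathbf v^i) \to X(\mathbf r^i, \mathbf v^i) - 1$$

and either

$$X(\mathbf r^i + \boldsymbol\lambda_x, \mathbf v^i) \to X(\mathbf r^i + \boldsymbol\lambda_x, \mathbf v^i) + 1 \quad (v_x > 0) \tag{8.5.21}$$

or

$$X(\mathbf r^i - \boldsymbol\lambda_x, \mathbf v^i) \to X(\mathbf r^i - \boldsymbol\lambda_x, \mathbf v^i) + 1 \quad (v_x < 0).$$

Then we make the labelling $A \to (i, x, 1)$:

$$\begin{aligned} N_a^{(i,x,1)} &= \delta(\mathbf r^i, \mathbf r^a)\,\delta(\mathbf v^i, \mathbf v^a) \\ M_a^{(i,x,1)} &= \delta(\mathbf r^i - \boldsymbol\lambda_x, \mathbf r^a)\,\delta(\mathbf v^i, \mathbf v^a) \\ r_a^{(i,x,1)} &= M_a^{(i,x,1)} - N_a^{(i,x,1)} \end{aligned} \tag{8.5.22}$$

$$t^+_{(i,x,1)}(X) = \lambda^2\, |v_x|\, X(\mathbf r^i, \mathbf v^i)/\lambda^3, \qquad t^-_{(i,x,1)}(X) = 0. \tag{8.5.23}$$

The form (8.5.23) is written to indicate explicitly that the transition probability is
the product of the end surface area of the cell ($\lambda^2$), the number of particles per
unit volume, and the x component of velocity, which is an approximation to the
rate of flow across this cell face.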

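Consequently, assuming $v_x > 0$,

$$A_a(X) = v_x\,[X(\mathbf r^a + \boldsymbol\lambda_x, \mathbf v^a) - X(\mathbf r^a, \mathbf v^a)]/\lambda \tag{8.5.24}$$

$$= \xi^3 \lambda^3\, v_x\,[f(\mathbf r^a + \boldsymbol\lambda_x, \mathbf v^a) - f(\mathbf r^a, \mathbf v^a)]/\lambda \tag{8.5.25}$$

$$\to \xi^3 \lambda^3\, v_x\, \frac{\partial}{\partial x} f(\mathbf r, \mathbf v). \tag{8.5.26}$$

In a similar way to the previous example, the diffusion matrix $B_{ab}$ can be shown
to be proportional to $\lambda$ and hence vanishes as $\lambda \to 0$.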


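Type 2: Flow in velocity space:
Define

$$\boldsymbol\xi_x = (\xi, 0, 0). \tag{8.5.27}$$

Then we have

$$X(\mathbf r^i, \mathbf v^i) \to X(\mathbf r^i, \mathbf v^i) - 1$$

and either

$$X(\mathbf r^i, \mathbf v^i + \boldsymbol\xi_x) \to X(\mathbf r^i, \mathbf v^i + \boldsymbol\xi_x) + 1 \quad (A_x > 0) \tag{8.5.28}$$

or

$$X(\mathbf r^i, \mathbf v^i - \boldsymbol\xi_x) \to X(\mathbf r^i, \mathbf v^i - \boldsymbol\xi_x) + 1 \quad (A_x < 0).$$

The labelling is

$$\begin{aligned} A &\to (i, x, 2) \\ N_a^{(i,x,2)} &= \delta(\mathbf v^i, \mathbf v^a)\,\delta(\mathbf r^i, \mathbf r^a) \\ M_a^{(i,x,2)} &= \delta(\mathbf v^i - \boldsymbol\xi_x, \mathbf v^a)\,\delta(\mathbf r^i, \mathbf r^a) \\ r_a^{(i,x,2)} &= [\delta(\mathbf v^i - \boldsymbol\xi_x, \mathbf v^a) - \delta(\mathbf v^i, \mathbf v^a)]\,\delta(\mathbf r^i, \mathbf r^a) \\ t^+_{(i,x,2)}(X) &= \xi^2\, |A_x(\mathbf r^i)|\, X(\mathbf r^i, \mathbf v^i)/\xi^3 \\ t^-_{(i,x,2)}(X) &= 0. \end{aligned} \tag{8.5.29}$$

Consequently, assuming $A_x(\mathbf r^a) > 0$, the drift coefficient is

$$A_a = [X(\mathbf r^a, \mathbf v^a + \boldsymbol\xi_x)\,A_x(\mathbf r^a) - X(\mathbf r^a, \mathbf v^a)\,A_x(\mathbf r^a)]/\xi \tag{8.5.30}$$

$$= \xi^3 \lambda^3\, A_x(\mathbf r^a)\,[f(\mathbf r^a, \mathbf v^a + \boldsymbol\xi_x) - f(\mathbf r^a, \mathbf v^a)]/\xi \tag{8.5.31}$$

$$\to \xi^3 \lambda^3\, A_x(\mathbf r)\, \frac{\partial}{\partial v_x} f(\mathbf r, \mathbf v). \tag{8.5.32}$$

Again, similar reasoning shows the diffusion coefficient vanishes as $\xi \to 0$.


Putting together (8.5.26, 32), one obtains the appropriate flow equation (8.5.3)
in the $\lambda, \xi \to 0$ limit.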


8.5.3 Inclusion of Collisions—the Boltzmann Master Equation 


We consider firstly particles in velocity space only and divide the velocity space 
into cells as before. Let X(v) be the number of molecules with velocity v (where the 
velocity is, of course, discretised). 

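A collision is then represented by a “chemical reaction”,

$$X(v_i) + X(v_j) \to X(v_k) + X(v_l). \tag{8.5.33}$$

The collision is labelled by the index (i, j, k, l) and we have (using the notation of
Sect. 7.5)

$$\begin{aligned} N_a^{ij,kl} &= \delta_{a,i} + \delta_{a,j} \\ M_a^{ij,kl} &= \delta_{a,k} + \delta_{a,l} \\ r_a^{ij,kl} &= -\delta_{a,i} - \delta_{a,j} + \delta_{a,k} + \delta_{a,l} \end{aligned} \tag{8.5.34}$$

and the transition probability per unit time is taken in the form

$$t^+_{ij,kl}(X) = R(ij, kl)\,X(v_i)X(v_j), \qquad t^-_{ij,kl}(X) = 0. \tag{8.5.35}$$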


There are five collisional invariants, that is, quantities conserved during a colli- 
sion, which arise from dynamics. These are: 


i) the number of molecules: there are two on each side of (8.5.33);
ii) the three components of momentum: since all molecules here are the same, this
means in a collision

$$\mathbf v_i + \mathbf v_j = \mathbf v_k + \mathbf v_l; \tag{8.5.36}$$

iii) the total energy: this means that

$$v_i^2 + v_j^2 = v_k^2 + v_l^2. \tag{8.5.37}$$

The quantities $\mathbf v$ and $v^2$ are known as additive invariants and it can be shown that
any function which is similarly additively conserved is a linear function of them
(with a possible constant term) [8.8].


In all molecular collisions we have time reversal symmetry, which in this case
implies

$$R(ij, kl) = R(kl, ij). \tag{8.5.38}$$

Finally, because of the identity of all particles, we have

$$R(ij, kl) = R(ji, kl) = R(ij, lk), \text{ etc.} \tag{8.5.39}$$


We now have a variety of possible approaches. These are: 


i) Attempt to work directly with the Master equation. 

ii) System size expansion: we assume $\xi^3$, the volume in phase space of the phase
cells, is large, and we can write a FPE using the Kramers-Moyal expansion.
From (7.5.32), we can write a FPE with a drift term

$$A_a(X) = \sum_{i,j,k,l} (-\delta_{a,i} - \delta_{a,j} + \delta_{a,k} + \delta_{a,l})\,X(v_i)X(v_j)\,R(ij, kl) \tag{8.5.40}$$

and utilising all available symmetries,

$$= 2 \sum_{j,k,l} R(aj, kl)\,[X(v_k)X(v_l) - X(v_a)X(v_j)]. \tag{8.5.41}$$

The diffusion coefficient can also be deduced;

$$B_{ab}(X) = \sum_{i,j,k,l} r_a^{ij,kl}\, r_b^{ij,kl}\, R(ij, kl)\,X(v_i)X(v_j)$$

and again, utilising to the full all available symmetries,

$$\begin{aligned} B_{ab}(X) = {} & 2\delta_{a,b} \sum_{j,k,l} R(aj, kl)\,[X(v_a)X(v_j) + X(v_k)X(v_l)] \\ & + 2 \sum_{i,j} R(ij, ab)\,[X(v_i)X(v_j) + X(v_a)X(v_b)] \\ & - 4 \sum_{j,l} R(aj, bl)\,[X(v_a)X(v_j) + X(v_b)X(v_l)]. \end{aligned} \tag{8.5.42}$$

These imply a stochastic differential equation

$$dX(v_a) = \Big\{ 2 \sum_{j,k,l} R(aj, kl)\,[X(v_k)X(v_l) - X(v_a)X(v_j)] \Big\}\,dt + dW(v_a, t) \tag{8.5.43}$$

where

$$dW(v_a, t)\,dW(v_b, t) = dt\, B_{ab}(X). \tag{8.5.44}$$


Neglecting the stochastic term, we recover the Boltzmann Equation for $X(v_a)$ in a
discretised form. As always, this Kramers-Moyal equation is only valid in a small
noise limit which is equivalent to a system size expansion, the size being $\xi^3$, the
volume of the momentum space cells.
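
The structure of the drift term (8.5.41) can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration: the rate table is a random toy array that merely obeys the symmetries (8.5.38, 39), with no momentum or energy constraint built in, and the function names are hypothetical. Even so, the symmetries alone are enough for the total number of molecules to be conserved by the drift and for a uniform occupation to be stationary.

```python
import itertools
import random


def symmetric_rates(n, seed=0):
    """Toy R(ij,kl) obeying (8.5.38, 39); the values are random, not a physical kernel."""
    rng = random.Random(seed)
    base = {idx: rng.random() for idx in itertools.product(range(n), repeat=4)}
    rates = {}
    for (i, j, k, l) in base:
        # all index orderings related by particle identity and time reversal
        images = [(i, j, k, l), (j, i, k, l), (i, j, l, k), (j, i, l, k),
                  (k, l, i, j), (l, k, i, j), (k, l, j, i), (l, k, j, i)]
        rates[(i, j, k, l)] = sum(base[m] for m in images) / len(images)
    return rates


def drift(X, rates):
    """A_a(X) = 2 sum_{j,k,l} R(aj,kl) [X_k X_l - X_a X_j], following (8.5.41)."""
    n = len(X)
    return [2.0 * sum(rates[(a, j, k, l)] * (X[k] * X[l] - X[a] * X[j])
                      for j in range(n) for k in range(n) for l in range(n))
            for a in range(n)]


if __name__ == "__main__":
    R = symmetric_rates(4, seed=3)
    A = drift([1.0, 2.0, 0.5, 3.0], R)
    print("drift:", A, "sum:", sum(A))
```

The components of the drift are individually nonzero for a non-uniform occupation, but their sum vanishes to rounding error.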


iii) Poisson representation: the Boltzmann master equation is an ideal candidate for
a Poisson-representation treatment. Using the variable $\alpha(v_a)$ as usual, we can follow
through the results (7.7.69) to obtain a Poisson representation FPE with a drift
term

$$A_a(\alpha) = 2 \sum_{j,k,l} R(aj, kl)\,[\alpha(v_k)\alpha(v_l) - \alpha(v_a)\alpha(v_j)] \tag{8.5.45}$$

and diffusion matrix

$$\begin{aligned} B_{ab}(\alpha) = {} & 2\delta_{a,b} \sum_{j,k,l} R(aj, kl)\,[\alpha(v_k)\alpha(v_l) - \alpha(v_a)\alpha(v_j)] \\ & + 2 \sum_{i,j} R(ij, ab)\,[\alpha(v_i)\alpha(v_j) - \alpha(v_a)\alpha(v_b)] \\ & - 4 \sum_{j,l} R(aj, bl)\,[\alpha(v_a)\alpha(v_j) - \alpha(v_b)\alpha(v_l)] \end{aligned} \tag{8.5.46}$$

and the corresponding SDE is

$$d\alpha(v_a) = 2 \sum_{j,k,l} R(aj, kl)\,[\alpha(v_k)\alpha(v_l) - \alpha(v_a)\alpha(v_j)]\,dt + dW(v_a, t) \tag{8.5.47}$$

where

$$dW(v_a, t)\,dW(v_b, t) = dt\, B_{ab}(\alpha). \tag{8.5.48}$$


As emphasised previously, this Poisson representation stochastic differential equation
is exactly equivalent to the Boltzmann master equation assumed. Unlike the Kramers-
Moyal or system size expansions, it is valid for all sizes of velocity space cell $\xi$.


iv) Stationary solution of the Boltzmann master equation: we have chosen to
write the Boltzmann master equation with $t^-_{ij,kl}(X)$ zero, but we can alternatively
write

$$t^-_{ij,kl}(X) = t^+_{kl,ij}(X) \tag{8.5.49}$$

and appropriately divide all the R’s by 2, since everything is now counted twice.
The condition for detailed balance (7.5.18) is trivially satisfied. Although we
have set $t^-(ij, kl) = 0$, the reversed transition is actually given by $t^+(kl, ij)$. Hence,

$$k^+_{ij,kl} = k^-_{ij,kl} = R(ij, kl), \tag{8.5.50}$$

provided the time-reversal symmetry (8.5.38) is satisfied.
Under these conditions, the stationary state has a mean $\langle X \rangle$ satisfying

$$\alpha(v_i)\alpha(v_j) = \alpha(v_k)\alpha(v_l). \tag{8.5.51}$$

This means that $\log[\alpha(v_i)]$ is additively conserved and must be a function of the
invariants (8.5.36, 37). Hence,

$$\alpha(\mathbf v) \propto \exp[-m(\mathbf v - \mathbf U)^2/2kT]. \tag{8.5.52}$$

Here m is the mass of the molecules and U and kT are parameters which are of
course identified with the mean velocity of the molecules, and the absolute temperature
multiplied by Boltzmann’s constant.

The stationary distribution is then a multivariate Poisson with mean values
given by (8.5.52). The fluctuations in number are uncorrelated for different
velocities.
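
The stationarity condition (8.5.51) is easy to verify numerically for a Gaussian form of α(v). The sketch below is illustrative only: the parameter values are arbitrary, the Gaussian is written in the conventional Maxwellian form exp[-m(v - U)²/2kT] as an assumption, and the helper names are hypothetical. An elastic collision between identical particles is generated by keeping the total momentum p and rotating the relative velocity q at fixed magnitude.

```python
import math


def maxwellian(v, U=(0.3, -0.2, 0.1), m=1.0, kT=0.7):
    """Unnormalised alpha(v) = exp[-m (v - U)^2 / (2 kT)]; parameter values are arbitrary."""
    sq = sum((vi - ui) ** 2 for vi, ui in zip(v, U))
    return math.exp(-m * sq / (2.0 * kT))


def elastic_collision(v_i, v_j, direction):
    """Outgoing velocities (p + q)/2 and (p - q)/2 with p = v_i + v_j and |q| = |v_i - v_j|."""
    p = [a + b for a, b in zip(v_i, v_j)]
    rel = math.sqrt(sum((a - b) ** 2 for a, b in zip(v_i, v_j)))
    norm = math.sqrt(sum(d * d for d in direction))
    q = [rel * d / norm for d in direction]      # relative velocity after the collision
    v_k = [(pc + qc) / 2.0 for pc, qc in zip(p, q)]
    v_l = [(pc - qc) / 2.0 for pc, qc in zip(p, q)]
    return v_k, v_l


if __name__ == "__main__":
    v_i, v_j = (1.0, 0.5, -0.3), (-0.4, 0.2, 0.9)
    v_k, v_l = elastic_collision(v_i, v_j, (0.2, -1.0, 0.5))
    print(maxwellian(v_i) * maxwellian(v_j), maxwellian(v_k) * maxwellian(v_l))
```

Because momentum and kinetic energy are both conserved in the constructed collision, the two products printed agree to rounding error.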


8.5.4 Collisions and Flow Together 


There is a fundamental difficulty in combining the treatment of flow and that of 
collisions. It arises because a stochastic treatment of flow requires infinitesimally 
small cells, whereas the Boltzmann master equation is better understood in terms 
of cells of finite size. This means that it is almost impossible to write down expli- 
citly an exact stochastic equation for the system, except in the Poisson representa- 
tion which we shall shortly come to. 

To formally write a multivariate phase-space master equation is, however,
straightforward when we assume we have phase-space cells of finite size $\lambda^3 \xi^3$. We
simply include all transitions available, i.e., those leading to flow in position space,
flow in velocity space and collisions. The resultant Master equation thus includes
the possible transitions specified in (8.5.22, 23, 29) and in a modified form
(8.5.34, 35). Here, however, we have collisions within each cell defined by the transition
probability per unit time

$$t^+_{ij,kl}(X) = \delta(\mathbf r_i, \mathbf r_j)\,\delta(\mathbf r_k, \mathbf r_l)\,\delta(\mathbf r_i, \mathbf r_k)\,R(ij, kl)\,X_i X_j. \tag{8.5.53}$$

For finite $\lambda^3 \xi^3$, there will be an extra stochastic effect arising from the finite cell
size as pointed out in Sect. 8.5.2, which disappears in the limit of small $\lambda$ and $\xi$
when transfer from flow is purely deterministic.

The resulting master equation is rather cumbersome to write down and we
shall not do this explicitly. Most work that has been done with it has involved a
system size expansion or equivalently, a Kramers-Moyal approximation. The
precise limit in which this is valid depends on the system size dependence of R(ij, kl).
The system size in this case is the six-dimensional phase-space volume $\lambda^3 \xi^3$. In
order to make the deterministic equation for the density, defined by

$$f(\mathbf r_i, \mathbf v_i) = X(\mathbf r_i, \mathbf v_i)/\lambda^3 \xi^3, \tag{8.5.54}$$

independent of cell size, R(ij, kl) as defined in (8.5.53) must scale like $(\lambda^3 \xi^3)^4$, i.e.

$$R(ij, kl) = \bar R(ij, kl)\,(\lambda^3 \xi^3)^4. \tag{8.5.55}$$

This is interpretable as meaning that $\bar R(ij, kl)$ is the mean collision rate per phase-
space volume element in each of the arguments.


Taking the conservation law (8.5.36, 37) into account, we can then write

$$\bar R(ij, kl) = 8\sigma[(\mathbf v_i - \mathbf v_j)^2, (\mathbf v_i - \mathbf v_j) \cdot (\mathbf v_k - \mathbf v_l)] \times \delta(v_i^2 + v_j^2 - v_k^2 - v_l^2)\,\delta(\mathbf v_i + \mathbf v_j - \mathbf v_k - \mathbf v_l) \tag{8.5.56}$$

and we have assumed that $\sigma$ is a function only of scalars. [The fact that $\sigma$ is only a
function of $(\mathbf v_i - \mathbf v_j)^2$ and $(\mathbf v_i - \mathbf v_j) \cdot (\mathbf v_k - \mathbf v_l)$ is a result of scattering theory, and it
follows from invariance with respect to the Galilean group of transformations, i.e.,
rotational invariance, and the fact that the laws of physics do not depend on the
particular choice of unaccelerated frame of reference. We choose to keep the
dependence in terms of scalar products for simplicity of expression in the fluctuation
terms.]


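a) Kramers-Moyal Expression
We now replace the summations by integrations according to

$$(\xi^3 \lambda^3) \sum_j \to \int d^3r_j\, d^3v_j \tag{8.5.57}$$

and change the variables by

$$\mathbf v_j = \mathbf v_1, \qquad \mathbf v_k = \tfrac12(\mathbf p + \mathbf q), \qquad \mathbf v_l = \tfrac12(\mathbf p - \mathbf q). \tag{8.5.58}$$

After a certain amount of manipulation in the collision term, the deterministic
equation comes out in the form [from (8.5.3, 41)]

$$\begin{aligned} df(\mathbf r, \mathbf v) = \Big\{ & -\mathbf v \cdot \nabla f(\mathbf r, \mathbf v) - \mathbf A \cdot \nabla_v f(\mathbf r, \mathbf v) + \int d^3v_1 \int \frac{d^3q}{|\mathbf q|}\, \delta(|\mathbf q| - |\mathbf v - \mathbf v_1|)\, \sigma[q^2, \mathbf q \cdot (\mathbf v - \mathbf v_1)] \\ & \times \big( f[\mathbf r, \tfrac12(\mathbf v + \mathbf v_1 - \mathbf q)]\, f[\mathbf r, \tfrac12(\mathbf v + \mathbf v_1 + \mathbf q)] - f(\mathbf r, \mathbf v)\, f(\mathbf r, \mathbf v_1) \big) \Big\}\, dt. \end{aligned} \tag{8.5.59}$$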


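The stochastic differential equation, arising from a Kramers-Moyal expansion, is
obtained by adding a stochastic term dW(r, v, t) satisfying

$$\begin{aligned} dW(\mathbf r, \mathbf v, t)\, dW(\mathbf r', \mathbf v', t) = {} & \delta(\mathbf r - \mathbf r')\,dt \times \Big\{ \delta(\mathbf v - \mathbf v') \int d^3v_1 \int \frac{d^3q}{|\mathbf q|}\, \delta(|\mathbf q| - |\mathbf v - \mathbf v_1|) \\ & \times \sigma[q^2, \mathbf q \cdot (\mathbf v - \mathbf v_1)]\, \big\{ f[\mathbf r, \tfrac12(\mathbf v + \mathbf v_1 - \mathbf q)]\, f[\mathbf r, \tfrac12(\mathbf v + \mathbf v_1 + \mathbf q)] + f(\mathbf r, \mathbf v)\, f(\mathbf r, \mathbf v_1) \big\} \\ & - 2 \int d^3k\, \delta[(\mathbf v - \mathbf v') \cdot \mathbf k]\, \sigma[(\mathbf v - \mathbf v')^2 + \tfrac14 k^2, -(\mathbf v - \mathbf v')^2 + \tfrac14 k^2] \\ & \times [ f(\mathbf r, \mathbf v)\, f(\mathbf r, \mathbf v' - \mathbf k) + f(\mathbf r, \mathbf v')\, f(\mathbf r, \mathbf v - \mathbf k) ] \Big\}. \end{aligned} \tag{8.5.60}$$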


Using such a Kramers-Moyal method, van den Broek et al. [8.7] have been able 
to apply a Chapman-Enskog expansion and have obtained fluctuating hydrody- 
namics from it. 

The validity of this method, which depends on the largeness of the cells for the 
validity of the Kramers-Moyal expansion and the smallness of the cells for the 
validity of the modelling of flow as a birth-death process, is naturally open to 
question. However, since the Chapman-Enskog method is probably equivalent to 
the adiabatic elimination of the variables governed by the Boltzmann collision term, 
the result of adiabatic elimination is not likely to be very sensitive to the precise
form of this operator. Thus, the Kramers-Moyal approximation may indeed be 
sufficiently accurate, even for very small cells. 


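b) Poisson Representation
The Poisson representation stochastic differential equation can be similarly
obtained from (8.5.45, 46). We use the symbol $\phi(\mathbf r, \mathbf v)$ defined by

$$\phi(\mathbf r, \mathbf v) = \alpha(\mathbf r, \mathbf v)/(\lambda^3 \xi^3). \tag{8.5.61}$$

We find

$$\begin{aligned} d\phi(\mathbf r, \mathbf v) = dt\, \Big\{ & -\mathbf v \cdot \nabla \phi(\mathbf r, \mathbf v) - \mathbf A \cdot \nabla_v \phi(\mathbf r, \mathbf v) + \int d^3v_1 \int \frac{d^3q}{|\mathbf q|}\, \delta(|\mathbf q| - |\mathbf v - \mathbf v_1|)\, \sigma[q^2, \mathbf q \cdot (\mathbf v - \mathbf v_1)] \\ & \times \big\{ \phi[\mathbf r, \tfrac12(\mathbf v + \mathbf v_1 - \mathbf q)]\, \phi[\mathbf r, \tfrac12(\mathbf v + \mathbf v_1 + \mathbf q)] - \phi(\mathbf r, \mathbf v)\, \phi(\mathbf r, \mathbf v_1) \big\} \Big\} + dW(\mathbf r, \mathbf v, t), \end{aligned} \tag{8.5.62}$$

where

$$\begin{aligned} dW(\mathbf r, \mathbf v, t)\, dW(\mathbf r', \mathbf v', t) = {} & \delta(\mathbf r - \mathbf r')\,dt \times \Big( \delta(\mathbf v - \mathbf v') \int d^3v_1 \int \frac{d^3q}{|\mathbf q|}\, \delta(|\mathbf q| - |\mathbf v - \mathbf v_1|)\, \sigma[q^2, \mathbf q \cdot (\mathbf v - \mathbf v_1)] \\ & \times \big\{ \phi[\mathbf r, \tfrac12(\mathbf v + \mathbf v_1 - \mathbf q)]\, \phi[\mathbf r, \tfrac12(\mathbf v + \mathbf v_1 + \mathbf q)] - \phi(\mathbf r, \mathbf v)\, \phi(\mathbf r, \mathbf v_1) \big\} \\ & + \int \frac{d^3q}{|\mathbf q|}\, \delta(|\mathbf q| - |\mathbf v - \mathbf v'|)\, \sigma[q^2, \mathbf q \cdot (\mathbf v - \mathbf v')] \\ & \times \big\{ \phi[\mathbf r, \tfrac12(\mathbf v + \mathbf v' - \mathbf q)]\, \phi[\mathbf r, \tfrac12(\mathbf v + \mathbf v' + \mathbf q)] - \phi(\mathbf r, \mathbf v)\, \phi(\mathbf r, \mathbf v') \big\} \Big). \end{aligned} \tag{8.5.63}$$

The terms in (8.5.63) correspond to the first two terms in (8.5.46). The final term
gives zero contribution in the limit that $\xi^3 \lambda^3 \to 0$.

As always, we emphasise that this Poisson representation stochastic differential
equation is exactly equivalent to the small cell size limit of the Boltzmann Master
equation with flow terms added. Equations (8.5.62, 63) have not previously been
written down explicitly, and so far, have not been applied. By employing Chapman-
Enskog or similar techniques, one could probably deduce exact fluctuating
hydrodynamic equations.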


9. Bistability, Metastability, and Escape Problems 


This chapter is devoted to the asymptotic study of systems which can exist in at 
least two stable states, and to some closely related problems. Such systems are of 
great practical importance, e.g., switching and storage devices in computers are 
systems which have this property. So do certain molecules which can isomerise, 
and more recently, a large number of electronic, chemical and physical systems 
which demonstrate related properties in rich variety have been investigated. 


The problems of interest are: 
i) How stable are the various states relative to each other? 
ii) How long does it take for a system to switch spontaneously from one state 
to another? 
iii) How is the transfer made, i.e., through what path in the relevant state space?
iv) How does a system relax from an unstable state? 


These questions can all be answered relatively easily for one-dimensional diffusion
processes, but the extension to several, but few dimensions is only recent.
The extension to infinitely many variables brings us to the field of the liquid-gas 
transition and similar phase transitions where the system can be in one of two 
phases and arbitrarily distributed in space. This is a field which is not ready to be 
written down systematically from a stochastic point of view, and it is not treated 
here. 

The chapter is divided basically into three parts: single variable bistable diffu- 
sion processes, one-step birth-death bistable systems and many-variable diffusion 
processes. The results are all qualitatively similar, but a great deal of effort must be 
invested for quantitative precision. 


9.1 Diffusion in a Double-Well Potential (One Variable)


We consider once more the model of Sect. 5.2.7 where the probability density 
p(x, t) of a particle obeys the Fokker-Planck equation 


$$\partial_t p(x, t) = \partial_x[U'(x)p(x, t)] + D\,\partial_x^2 p(x, t). \tag{9.1.1}$$


The shape of U(x) is as shown in Fig. 9.1. There are two minima at a and c and in 
between, a local maximum. The stationary distribution is 


$$p_s(x) = \mathcal N \exp[-U(x)/D] \tag{9.1.2}$$


and it is this that demonstrates the bistability. Corresponding to a, c and b are
two maxima, and a central minimum as plotted in Fig. 9.1. The system is thus most
likely to be found at a or c.

Fig. 9.1. Plot of $p_s(x)$ and U(x) for a double well potential


9.1.1 Behaviour for D = 0 


In this case, x(t) obeys the differential equation 


$$\dot x = -U'(x), \qquad x(0) = x_0. \tag{9.1.3}$$

Since

$$\frac{dU(x)}{dt} = U'(x)\,\frac{dx}{dt} = -[U'(x)]^2 < 0,$$


x(t) always moves in such a way as to minimise U(x), and stops only when U’(x) 
is zero. Thus, depending on whether $x_0$ is greater than or less than b, the particle
ends up at c or a, respectively. The motion follows the arrows on the figure. 

Once the particle is at a or c it stays there. If it starts exactly at b, it also stays 
there, though the slightest perturbation drives it to a or c. Thus, b is an unstable 
stationary point and a and c are stable. There is no question of relative stability of
a and c.
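
The noiseless behaviour can be illustrated by integrating the deterministic equation directly. The following minimal sketch assumes, purely for illustration, the symmetric double well U(x) = x⁴/4 - x²/2, whose minima lie at a = -1 and c = 1 and whose maximum lies at b = 0; the potential, step size and function names are not taken from the text.

```python
def U_prime(x):
    """U'(x) for the illustrative double well U(x) = x**4/4 - x**2/2."""
    return x**3 - x


def relax(x0, dt=0.01, steps=5000):
    """Euler integration of dx/dt = -U'(x) starting from x0."""
    x = x0
    for _ in range(steps):
        x -= dt * U_prime(x)
    return x


if __name__ == "__main__":
    for x0 in (-0.05, 0.0, 0.05):
        print(f"x0 = {x0:+.2f} -> x(t_end) = {relax(x0):+.6f}")
```

A start slightly to the left of b ends at a, a start slightly to the right ends at c, and a start exactly at b stays there.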


9.1.2 Behaviour if D is Very Small 


With the addition of noise, the situation changes. The stationary state can be
approximated asymptotically as follows. Assuming U(x) is everywhere sufficiently
smooth, we can write

$$U(x) \simeq \begin{cases} U(a) + \tfrac12 U''(a)(x - a)^2 & |x - a| \text{ small} \\ U(c) + \tfrac12 U''(c)(x - c)^2 & |x - c| \text{ small.} \end{cases} \tag{9.1.4}$$

If D is very small, then we may approximate

$$p_s(x) \simeq \begin{cases} \mathcal N \exp[-U(a)/D - \tfrac12 U''(a)(x - a)^2/D] & |x - a| \text{ small} \\ \mathcal N \exp[-U(c)/D - \tfrac12 U''(c)(x - c)^2/D] & |x - c| \text{ small} \\ 0 & \text{(elsewhere)} \end{cases} \tag{9.1.5}$$

so that

$$\mathcal N^{-1} \simeq e^{-U(a)/D} \sqrt{2\pi D/U''(a)} + e^{-U(c)/D} \sqrt{2\pi D/U''(c)}. \tag{9.1.6}$$

Suppose, as drawn in the figure,

$$U(a) > U(c). \tag{9.1.7}$$

Then for small enough D, the second term is overwhelmingly larger than the first
and $\mathcal N^{-1}$ can be approximated by it alone. Substituting into (9.1.5) we find

$$p_s(x) \simeq \begin{cases} \sqrt{\dfrac{U''(c)}{2\pi D}}\, \exp[-\tfrac12 U''(c)(x - c)^2/D] & |x - c| \sim \sqrt D \\ 0 & \text{(otherwise).} \end{cases} \tag{9.1.8}$$

This means that in the limit of very small D, the deterministic stationary state at
which U(x) has an absolute minimum is the more stable state in the sense that in
the stochastic stationary state, $p_s(x)$ is very small everywhere except in its
immediate vicinity.
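
How quickly the stationary weight concentrates in the deeper well can be seen by direct quadrature of exp[-U(x)/D]. The sketch below is an illustration under stated assumptions: the tilted potential, the approximate barrier position `b`, the integration range and the function names are all choices made here, not quantities from the text. The potential is tilted so that the right-hand well is the deeper one, matching the convention U(a) > U(c).

```python
import math


def U(x):
    """Illustrative tilted double well with U(a) > U(c); not taken from the text."""
    return x**4 / 4 - x**2 / 2 - 0.1 * x


def right_well_weight(D, b=-0.1010, lo=-3.0, hi=3.0, n=6000):
    """Fraction of exp[-U(x)/D] lying to the right of the barrier top b (trapezoid rule)."""
    h = (hi - lo) / n
    total = right = 0.0
    for i in range(n + 1):
        x = lo + i * h
        w = math.exp(-U(x) / D) * (0.5 if i in (0, n) else 1.0)
        total += w
        if x > b:
            right += w
    return right / total


if __name__ == "__main__":
    for D in (0.2, 0.1, 0.05):
        print(f"D = {D}: weight of deeper well = {right_well_weight(D):.4f}")
```

As D decreases the weight of the deeper well approaches one, even though the two wells differ in depth by only about 0.2 in these units.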

Of course this result disagrees with the previous one, which stated that each
state was equally stable. The distinction is one of time, and effectively we will show
that the deterministic behaviour is reproduced stochastically if we start the system
at $x_0$ and consider the limit $D \to 0$ of p(x, t) for any finite t. The methods of Sect.
6.3 show this as long as the expansion about the deterministic equation is valid.
Equation (6.3.10) shows that this will be the case provided $U'(x_0)$ is nonzero, or,
in the case of any finite D, $U'(x_0)$ is of order $D^0$ (here, D replaces the $\varepsilon^2$ in Sect. 6.3).
This is true provided $x_0$ is not within a neighbourhood of width of order $D^{1/2}$ of
a, c, or b. This means that in the case of a and c, fluctuations take over and the
motion is given approximately by linearising the SDE around a or c. Around b,
the linearised SDE is unstable. The particle, therefore, follows the Ornstein-
Uhlenbeck Process until it leaves the immediate neighbourhood of x = b, at which
stage the asymptotic expansion in $\sqrt D$ takes over.

However, for $t \to \infty$, the asymptotic expansion is no longer valid. Or, in other
words, the $t \to \infty$ limit of the small noise perturbation theory does not reproduce
the $D \to 0$ limit of the stationary state.

The process that can occur is that of escape over the central barrier. The noise
dW(t) can cause the particle to climb the barrier at b and reach the other side. This
involves escape rates of order exp(-const/D), which do not contribute to an asymptotic
expansion in powers of D since they go to zero faster than any power of D as $D \to 0$.
9.1.3 Exit Time


This was investigated in Sect. 5.2.7c. The time for the particle, initially near a, to
reach the central point b is

$$T(a \to b) = \pi\,\{|U''(b)|\,U''(a)\}^{-1/2} \exp\{[U(b) - U(a)]/D\} \tag{9.1.9}$$

(half as long as the time for the particle to reach a point well to the right of b). For
$D \to 0$, this becomes exponentially large. The time taken for the system to reach
the stationary state will thus also become exponentially large, and on such a time
scale development of the solutions of the corresponding SDE in powers of $D^{1/2}$
cannot be expected to be valid.
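
The asymptotic formula (9.1.9) can be compared with a direct quadrature. The sketch below makes several assumptions that are not in the text: it uses the symmetric double well U(x) = x⁴/4 - x²/2 (a = -1, b = 0), it takes the mean first passage time to be the standard double integral with a reflecting boundary far to the left, and the function names and grid sizes are arbitrary.

```python
import math


def U(x):
    """Illustrative symmetric double well: a = -1, b = 0, c = 1."""
    return x**4 / 4 - x**2 / 2


def exit_time_quadrature(D, a=-1.0, b=0.0, x_min=-3.0, n=6000):
    """(1/D) int_a^b dy exp[U(y)/D] int_{x_min}^y dz exp[-U(z)/D], by the trapezoid rule."""
    h = (b - x_min) / n
    xs = [x_min + i * h for i in range(n + 1)]
    w = [math.exp(-U(x) / D) for x in xs]
    inner = [0.0]                       # running inner integral from x_min
    for i in range(1, n + 1):
        inner.append(inner[-1] + 0.5 * h * (w[i - 1] + w[i]))
    i_a = round((a - x_min) / h)        # grid index of the well bottom a
    g = [math.exp(U(xs[i]) / D) * inner[i] for i in range(i_a, n + 1)]
    outer = h * (0.5 * g[0] + sum(g[1:-1]) + 0.5 * g[-1])
    return outer / D


def exit_time_kramers(D, a=-1.0, b=0.0):
    """Asymptotic formula (9.1.9), using U''(x) = 3 x**2 - 1 for this potential."""
    u2 = lambda x: 3 * x**2 - 1
    return math.pi / math.sqrt(abs(u2(b)) * u2(a)) * math.exp((U(b) - U(a)) / D)


if __name__ == "__main__":
    for D in (0.1, 0.05):
        print(D, exit_time_quadrature(D), exit_time_kramers(D))
```

For small D the two values agree to within corrections of relative order D, and both grow exponentially as D decreases.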


9.1.4 Splitting Probability 


Suppose we put the particle at $x_0$: what is the probability that it reaches a before c,
or c before a? This can be related to the problem of exit through a particular end of
an interval, studied in Sect. 5.2.8. We put absorbing barriers at x = a and x = c,
and using the results of that section, find $\pi_a$ and $\pi_c$, the “splitting probabilities” for
reaching a or c first. These are (noting that the diffusion coefficient is D, and hence
independent of x):

$$\begin{aligned} \pi_a(x_0) &= \left[\int_{x_0}^{c} dx\, p_s(x)^{-1}\right] \Big/ \left[\int_a^c dx\, p_s(x)^{-1}\right] \\ \pi_c(x_0) &= \left[\int_a^{x_0} dx\, p_s(x)^{-1}\right] \Big/ \left[\int_a^c dx\, p_s(x)^{-1}\right]. \end{aligned} \tag{9.1.10}$$

The splitting probabilities $\pi_a$ and $\pi_c$ can be viewed more generally as simply the
probability that the particle, started at $x_0$, will fall into the left or right-hand well,
since the particle, having reached a, will remain on that side of the well for a time
of the same order as the mean exit time to b.

We now consider two possible asymptotic forms as D — 0. 


a) $x_0$ a Finite Distance from b
We first evaluate

$$\int_a^c dx\, p_s(x)^{-1} = \int_a^c dx\, \mathcal N^{-1} \exp[U(x)/D]. \tag{9.1.11}$$

This is dominated by the behaviour at $x \sim b$. An asymptotic evaluation is correctly
obtained by setting

$$U(x) \simeq U(b) - \tfrac12 |U''(b)|\,(b - x)^2. \tag{9.1.12}$$

As $D \to 0$, the limits at x = a, c effectively recede to $\pm\infty$ and we find

$$\int_a^c dx\, p_s(x)^{-1} \sim \mathcal N^{-1} \sqrt{\frac{2\pi D}{|U''(b)|}}\, \exp[U(b)/D]. \tag{9.1.13}$$

Now suppose $x_0 < b$. Then $\int_a^{x_0} dx\, p_s(x)^{-1}$ can be evaluated by the substitution

$$y = U(x) \tag{9.1.14}$$

with an inverse x = W(y) and is asymptotically

$$\mathcal N^{-1} \int_{U(a)}^{U(x_0)} e^{y/D}\, W'(y)\,dy \sim \mathcal N^{-1} D\, e^{U(x_0)/D}\, W'[U(x_0)]. \tag{9.1.15}$$

Thus,

$$\pi_c = \frac{1}{U'(x_0)} \sqrt{\frac{|U''(b)|\,D}{2\pi}}\, \exp\left[\frac{U(x_0) - U(b)}{D}\right] \tag{9.1.16}$$

and

$$\pi_a = 1 - \pi_c. \tag{9.1.17}$$

We see here that the splitting probability depends only on $x_0$ and b. Thus, the
probability of reaching c in this limit is governed entirely by the probability of jumping
the barrier at b. The points at a and c are effectively infinitely distant.

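b) $x_0$ Infinitesimally Distant from b

Suppose

$$x_0 = b - y_0 \sqrt D. \tag{9.1.18}$$

In this case, we can make the approximation (9.1.12) in both integrals. Defining
[9.1]

$$\operatorname{erf}(x) = \frac{2}{\sqrt\pi} \int_0^x dt\, e^{-t^2}, \tag{9.1.19}$$

we find

$$\pi_c = 1 - \pi_a \simeq \tfrac12 \left\{1 - \operatorname{erf}\left[y_0 \sqrt{|U''(b)|/2}\right]\right\} \tag{9.1.20}$$

$$= \tfrac12 \left\{1 - \operatorname{erf}\left[(b - x_0) \sqrt{\frac{|U''(b)|}{2D}}\right]\right\}. \tag{9.1.21}$$

Equation (9.1.21) is the result that would be obtained if we replaced U(x) by its
quadratic approximation (9.1.12) over the whole range.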
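
The difference between the two regimes can be made concrete by evaluating (9.1.10) by quadrature and comparing it with the quadratic-barrier estimate (9.1.21). This is a sketch under assumptions not found in the text: the symmetric double well U(x) = x⁴/4 - x²/2 (a = -1, b = 0, c = 1, |U''(b)| = 1) is chosen for convenience, and the function names are hypothetical.

```python
import math


def U(x):
    """Illustrative symmetric double well: a = -1, b = 0, c = 1."""
    return x**4 / 4 - x**2 / 2


def splitting_to_c(x0, D, a=-1.0, c=1.0, n=4000):
    """pi_c(x0) from (9.1.10): ratio of integrals of exp[U(x)/D], by the trapezoid rule."""
    def integral(lo, hi):
        h = (hi - lo) / n
        f = lambda x: math.exp(U(x) / D)
        return h * (0.5 * f(lo) + sum(f(lo + i * h) for i in range(1, n)) + 0.5 * f(hi))
    return integral(a, x0) / integral(a, c)


def splitting_to_c_quadratic(x0, D, b=0.0, curvature=1.0):
    """Quadratic-barrier estimate (9.1.21); curvature stands for |U''(b)|."""
    return 0.5 * (1.0 - math.erf((b - x0) * math.sqrt(curvature / (2.0 * D))))


if __name__ == "__main__":
    D = 0.05
    for x0 in (0.0, -0.1, -0.8):
        print(x0, splitting_to_c(x0, D), splitting_to_c_quadratic(x0, D))
```

Close to b the two expressions agree well, whereas at a finite distance from b the quadratic estimate underestimates the splitting probability by a large factor.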
c) Comparison of Two Regions

The two regions give different results, and we find that a simple linearisation of the
SDE [which is what replacing U(x) by a quadratic approximation amounts to] gives
the correct result only in the limit of large D and in a region of order of magnitude
$\sqrt D$ around the maximum b.


9.1.5 Decay from an Unstable State 


The mean time for a particle placed at a point on a potential to reach one well or the 
other is an object capable of being measured experimentally. If we use (9.1.1) for 
the process, then the mean time to reach a or c from b can be computed exactly 
using the formulae of Sect. 5.2.8. The mean time to reach a from b is the solution of 


$$-U'(x)\,\partial_x[\pi_a(x)T(a, x)] + D\,\partial_x^2[\pi_a(x)T(a, x)] = -\pi_a(x) \tag{9.1.22}$$

with the boundary conditions

$$\pi_a(a)T(a, a) = \pi_a(c)T(a, c) = 0 \tag{9.1.23}$$

and $\pi_a(x)$ given by (9.1.10).

The solution to (9.1.22) is quite straightforward to obtain by direct integration, 
but it is rather cumbersome. The solution technique is exactly the same as that used 
for (5.2.158) and the result is similar. Even the case covered by (5.2.158) where we 
do not distinguish between exit at the right or at the left, is very complicated. 

For the record, however, we set down that 


$$T(a, x) = \frac{\pi_c(x) \displaystyle\int_x^c dx'\, p_s(x')^{-1} \int_a^{x'} \pi_a(z)p_s(z)\,dz - \pi_a(x) \displaystyle\int_a^x dx'\, p_s(x')^{-1} \int_a^{x'} \pi_a(z)p_s(z)\,dz}{D\,\pi_a(x)} \tag{9.1.24}$$

where one considers that $\pi_a(x)$ is given by (9.1.10) and $p_s(z)$ by (9.1.2). It can be
seen that even for the simplest possible situation, namely,

$$U(x) = -\tfrac12 k x^2, \tag{9.1.25}$$

the expression is almost impossible to comprehend. An asymptotic treatment is
perhaps required. Fortunately, in the cases where $p_s(z)$ is sharply peaked at a and c
with a sharp minimum at b, the problem reduces essentially to the problem of
relaxation to a or to c with a reflecting barrier at b.

To see this note that 


i) the explicit solution for $\pi_a(x)$ (9.1.10) means

$$\pi_a(x) \simeq \begin{cases} 1 & (x < b) \\ \tfrac12 & (x = b) \\ 0 & (x > b) \end{cases} \tag{9.1.26}$$

and the transition from 1 to 0 takes place over a distance of order $\sqrt D$, the width
of the peak in $p_s(x)^{-1}$.
ii) In the integrals with integrand $\pi_a(z)p_s(z)$, we distinguish two cases.


x' > b: in this case the estimates allow us to say (9.1.27) 


$$\int_a^{x'} \pi_a(z)p_s(z)\,dz = n_a/2, \tag{9.1.29}$$

where $n_a = \int_{-\infty}^{b} p_s(z)\,dz$ and represents the probability of being in the left-hand well.


However, when x' < b, we may still approximate $\pi_a(z)$ by 1, so we get

$$\int_a^{x'} \pi_a(z)p_s(z)\,dz = \int_a^{x'} p_s(z)\,dz = \frac{n_a}{2} - \int_{x'}^{b} p_s(z)\,dz. \tag{9.1.30}$$

Substituting these estimates into (9.1.24) we obtain

$$T(a, b) = D^{-1} \int_a^b dx'\, p_s(x')^{-1} \int_{x'}^{b} p_s(z)\,dz, \tag{9.1.31}$$

which is the exact mean exit time from b to a in which there is a reflecting barrier at
b. Similarly,

$$T(c, b) = D^{-1} \int_b^c dx'\, p_s(x')^{-1} \int_b^{x'} p_s(z)\,dz. \tag{9.1.32}$$


9.2 Equilibration of Populations in Each Well 


Suppose we start the system with the particle initially in the left-hand well at some
position $x_i$ so that

$$p(x, 0) = \delta(x - x_i). \tag{9.2.1}$$


If D is very small, the time for x to reach the centre is very long and for times small 
compared with the first exit time, there will be no effect arising from the existence 
of the well at c. We may effectively assume that there is a reflecting barrier at b. 

The motion inside the left-hand well will be described simply by a small noise 
expansion, and thus the typical relaxation time will be the same as that of the 
deterministic motion. Let us approximate 


$$U(x) \simeq U(a) + \tfrac12 U''(a)(x - a)^2;$$

then the system is now an Ornstein-Uhlenbeck process and the typical time scale is
of the order of $[U''(a)]^{-1}$.

Thus, we expect a two time scale description. In the short term, the system 
relaxes to a quasistationary state in the well in which it started. Then on a longer 
time scale, it can jump over the maximum at b and the long-time bimodal stationary 
distribution is approached. 
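
The short-time part of this picture can be illustrated by simulating the stochastic differential equation that corresponds to (9.1.1), namely dx = -U'(x) dt + √(2D) dW(t). The sketch below is an assumption-laden illustration: it uses the symmetric double well U(x) = x⁴/4 - x²/2, a simple Euler-Maruyama step, and arbitrary values for the noise strength, time step and seed; none of these appear in the text.

```python
import math
import random


def simulate(x0=-0.7, D=0.02, dt=0.01, t_end=20.0, seed=1):
    """Euler-Maruyama path of dx = -U'(x) dt + sqrt(2 D) dW for U(x) = x**4/4 - x**2/2."""
    rng = random.Random(seed)
    x, path = x0, []
    for _ in range(int(round(t_end / dt))):
        drift = -(x**3 - x)                       # -U'(x)
        x += drift * dt + math.sqrt(2.0 * D * dt) * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


if __name__ == "__main__":
    path = simulate()
    late = path[500:]                             # discard the initial relaxation
    print("max of path:", max(path))
    print("late-time mean:", sum(late) / len(late))
```

Over a time span that is long compared with 1/U''(a) but very short compared with the escape time, the path relaxes towards a = -1 and fluctuates around it without crossing the barrier at b = 0.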
9.2.1 Kramers’ Method


In 1940, Kramers [9.2] considered the escape problem from the point of view of
molecular transformation. He introduced what is called the Kramers equation
(Sect. 5.3.6a) in which he considered motion under the influence of a potential
V(x) which was double welled. In the case of large damping, he showed that a
corresponding Smoluchowski equation of the form found in (6.4.18) could be
used, and hence, fundamentally, the escape problem is reduced to the one
presently under consideration.

Kramers’ method has been rediscovered and reformulated many times [9.3].
It will be presented here in a form which makes its precise range of validity
reasonably clear.

Using the notation of Fig. 9.1, define 


$$M(x, t) = \int_{-\infty}^{x} dx'\, p(x', t) \tag{9.2.2}$$

$$N_a(t) = 1 - N_c(t) = M(b, t)$$

and

$$N_0(t) = (c - a)\,p(x_0, t). \tag{9.2.3}$$

Further, define the corresponding stationary quantities by

$$\begin{aligned} n_a &= 1 - n_c = \int_{-\infty}^{b} p_s(x')\,dx' \\ n_0 &= (c - a)\,p_s(x_0). \end{aligned} \tag{9.2.4}$$

From the FPE (9.1.1) and the form of $p_s(x)$ given in (9.1.2) we can write

$$\partial_t M(x, t) = D\, p_s(x)\, \partial_x[p(x, t)/p_s(x)] \tag{9.2.5}$$

which can be integrated to give

$$\partial_t \int_a^{x_0} dx\, M(x, t)/p_s(x) = D\,[p(x_0, t)/p_s(x_0) - p(a, t)/p_s(a)]. \tag{9.2.6}$$


This equation contains no approximations. We want to introduce some kind of 
approximation which would be valid at long times. 

We are forced to introduce a somewhat less rigorous argument than desirable 
in order to present the essence of the method. Since we believe relaxation within 
each well is rather rapid, we would expect the distribution in each well (in a time 
of order of magnitude which is finite as D — 0) to approach the same shape as the 
stationary distribution, but the relative weights of the two peaks to be different. 
This can be formalised by writing 


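$$p(x, t) = \begin{cases} p_s(x)N_a(t)/n_a & x < b \\ p_s(x)N_c(t)/n_c & x > b. \end{cases} \tag{9.2.7}$$

This would be accurate to lowest order in D, except in a region of magnitude $\sqrt D$
around b.
If we substitute these into (9.2.6), we obtain

$$\begin{aligned} \kappa(x_0)\,\dot N_a(t) &= D\,[N_0(t)/n_0 - N_a(t)/n_a] \\ \mu(x_0)\,\dot N_c(t) &= D\,[N_0(t)/n_0 - N_c(t)/n_c] \end{aligned} \tag{9.2.8}$$

with

$$\begin{aligned} \kappa(x_0) &= \int_a^{x_0} p_s(x)^{-1}[1 - \psi(x)]\,dx \\ \mu(x_0) &= \int_{x_0}^{c} p_s(x)^{-1}[1 - \psi(x)]\,dx \end{aligned} \tag{9.2.9}$$

and

$$\psi(x) = \begin{cases} n_a^{-1} \displaystyle\int_x^b p_s(z)\,dz & x < b \\ n_c^{-1} \displaystyle\int_b^x p_s(z)\,dz & x > b. \end{cases} \tag{9.2.10}$$

Note that if x is finitely different from a or c, then $\psi(x)$ vanishes exponentially as
$D \to 0$, as follows directly from the explicit form of $p_s(x)$. Hence, since x in both
integrals (9.2.9) satisfies this condition over the whole range of integration, we can
set

$$\psi(x) = 0$$

and use

$$\begin{aligned} \kappa(x_0) &= \int_a^{x_0} p_s(x)^{-1}\,dx \\ \mu(x_0) &= \int_{x_0}^{c} p_s(x)^{-1}\,dx. \end{aligned} \tag{9.2.11}$$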


a) Three State Interpretation
Equations (9.2.8) correspond to a process able to be written as a chemical reaction
of the kind

$$X_a \rightleftharpoons X_0 \rightleftharpoons X_c \tag{9.2.12}$$

except that there is no equation for $N_0$, the number of $X_0$. By noting that
$N_a + N_c = 1$, we find that

$$N_0(t) = n_0\,[\mu(x_0)N_a(t)/n_a + \kappa(x_0)N_c(t)/n_c]/[\kappa(x_0) + \mu(x_0)]. \tag{9.2.13}$$

This is the same equation as would be obtained by adiabatically eliminating the
variable $N_0(t)$ from (9.2.8) and the further equation for $N_0(t)$

$$\dot N_0(t) = D\,\{N_a(t)/[n_a \kappa(x_0)] + N_c(t)/[n_c \mu(x_0)]\} - D\,N_0(t)\,\{[n_0 \kappa(x_0)]^{-1} + [n_0 \mu(x_0)]^{-1}\}. \tag{9.2.14}$$

Since

$$n_0 = p_s(x_0)(c - a) = \mathcal N \exp[-U(x_0)/D]\,(c - a), \tag{9.2.15}$$

we see that the limit $D \to 0$ corresponds to $n_0 \to 0$, and hence the rate constant in
(9.2.14) multiplying $N_0(t)$ becomes exponentially large. Hence, adiabatic elimination
is valid.

This three-state interpretation is essentially the transition state theory of
chemical reactions proposed by Eyring [9.4].


b) Elimination of Intermediate States
Eliminating $N_0(t)$ from (9.2.8) by adding the two equations, we get

$$\dot N_a(t) = -\dot N_c(t) = -r_a N_a(t) + r_c N_c(t) \tag{9.2.16}$$

with

$$r_a = D \left[n_a \int_a^c dx\, p_s(x)^{-1}\right]^{-1}, \qquad r_c = D \left[n_c \int_a^c dx\, p_s(x)^{-1}\right]^{-1}, \tag{9.2.17}$$

where $r_a$ and $r_c$ are independent of $x_0$. Thus, the precise choice of $x_0$ does not
affect the interpeak relaxation.
Since $N_a + N_c = 1$, the relaxation time constant, $\tau_r$, is given by

$$\tau_r^{-1} = r_a + r_c = \frac{D}{n_a n_c \int_a^c dx\, p_s(x)^{-1}}. \tag{9.2.18}$$


c) The Escape Probability Per Unit Time for a particle initially near a to reach $x_0$
is the decay rate of $N_a(t)$ under the condition that an absorbing barrier is at $x_0$.

This means that in (9.2.8) we set $N_0(t) = 0$ [but note that $p_s(x)$ is defined by (9.1.2)].
Similar reasoning gives us

$$ \dot N_a(t) = -D\,N_a(t)/[n_a\kappa(x_0)] \tag{9.2.19} $$

so that the mean first passage time is given by

$$ \tau_a = n_a D^{-1}\int_a^{x_0} dx\,p_s(x)^{-1}. \tag{9.2.20} $$


This result is essentially that obtained in (5.2.166) by more rigorous reasoning.

d) Dependence of Relaxation Time on Peak Populations

Equation (9.2.18) looks at first glance like a simple formula relating the relaxation
time to $n_a$ and $n_c = 1 - n_a$. One might think that all other factors were independent
of $n_a$ and that $\tau_r \propto n_a(1 - n_a)$. However, a more careful evaluation shows this is
not so. If we use the asymptotic evaluation (9.1.13) we find

$$ \tau_r = \frac{n_a n_c}{D\,\mathcal{N}}\sqrt{\frac{2\pi D}{|U''(b)|}}\,\exp[U(b)/D] \tag{9.2.21} $$

and $\mathcal{N}^{-1}$ can be evaluated asymptotically by taking the contribution from
each peak. We obtain

$$ \mathcal{N}^{-1} = \sqrt{2\pi D}\,\{[U''(a)]^{-1/2}\exp[-U(a)/D] + [U''(c)]^{-1/2}\exp[-U(c)/D]\} \tag{9.2.22} $$

and similarly, by definition (9.2.4) of $n_a$ and $n_c$,

$$ n_a/n_c = \sqrt{U''(c)/U''(a)}\,\exp\{[U(c) - U(a)]/D\}. \tag{9.2.23} $$

After a certain amount of algebra, one can rewrite (9.2.21) as

$$ \tau_r = 2\pi H(b; a, c)\,[n_a n_c]^{1/2} \tag{9.2.24} $$

with H(b; a, c) a function which depends on the height of U(b) compared to the
average of U(a) and U(c): explicitly,

$$ H(b; a, c) = |U''(b)|^{-1/2}\,U''(a)^{-1/4}\,U''(c)^{-1/4}\,\exp\{[2U(b) - U(a) - U(c)]/(2D)\}. \tag{9.2.25} $$
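
As a rough numerical illustration of (9.2.18) and (9.2.24, 25), the following sketch evaluates the relaxation time by direct quadrature and compares it with the asymptotic form. The quartic double-well potential, the integration grid and all function names are assumptions chosen for illustration; they are not taken from the text.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# Hypothetical double well: minima at a = -1 and c = +1, barrier top at b = 0.
def U(x):
    return 0.25 * (x**2 - 1.0)**2

def d2U(x):
    return 3.0 * x**2 - 1.0

def relaxation_times(D, a=-1.0, b=0.0, c=1.0, x_lo=-3.0, x_hi=3.0, n=200001):
    """Return (tau from quadrature of (9.2.18), tau from the asymptotic form (9.2.24))."""
    x = np.linspace(x_lo, x_hi, n)
    w = np.exp(-U(x) / D)
    p_s = w / trapezoid(w, x)                       # stationary density N exp(-U/D)
    left = x <= b
    n_a = trapezoid(p_s[left], x[left])             # population of the peak at a
    n_c = 1.0 - n_a
    mid = (x >= a) & (x <= c)
    integral = trapezoid(1.0 / p_s[mid], x[mid])    # int_a^c dx / p_s(x)
    tau_quad = n_a * n_c * integral / D             # (9.2.18)
    H = (abs(d2U(b))**-0.5 * d2U(a)**-0.25 * d2U(c)**-0.25
         * np.exp((2.0 * U(b) - U(a) - U(c)) / (2.0 * D)))   # (9.2.25)
    tau_asym = 2.0 * np.pi * H * np.sqrt(n_a * n_c)          # (9.2.24)
    return tau_quad, tau_asym

for D in (0.04, 0.02, 0.01):
    tq, ta = relaxation_times(D)
    print(f"D = {D}: quadrature {tq:.4g}, asymptotic {ta:.4g}, ratio {tq / ta:.4f}")
```

The ratio of the two estimates should approach one as D decreases, since the Gaussian evaluation of the integrals becomes exact only in the limit of sharp peaks.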




9.2.2 Example: Reversible Denaturation of Chymotrypsinogen 


Chymotrypsinogen is a protein which can be transformed into a denatured form
by applying elevated pressures of up to several thousand atmospheres, as demonstrated
by Hawley [9.5]. Presumably the molecule collapses suddenly if sufficient
pressure is imposed.

A somewhat unrealistic, but simple, model of this process is given by the equation

$$ dx = -\frac{U'(x)}{\gamma}\,dt + \sqrt{\frac{2kT}{\gamma}}\,dW(t), \tag{9.2.26} $$

where x is the volume of a molecule, U(x) is the Gibbs free energy per molecule and
$kT/\gamma$ takes the place of D. Here $\gamma$ is a friction constant, which would arise by an
adiabatic elimination procedure like that used to derive the Smoluchowski equation
in Sect. 6.4. The stationary distribution is then

$$ p_s(x) = \mathcal{N}\exp[-U(x)/kT] \tag{9.2.27} $$


as required by statistical mechanics.

The explicit effect of the variation of pressure is included by writing

$$ U(x) = U_0(x) + x\,\delta p \tag{9.2.28} $$

where $\delta p$ is the pressure difference from the state in which $n_a$ and $n_c$ are equal. The
term $x\,\delta p$ changes the relative stability of the two minima and is equivalent to the
work done against the pressure $\delta p$. From (9.2.23) this requires $U_0(x)$ to satisfy

$$ \sqrt{U_0''(a)}\,\exp[U_0(a)/kT] = \sqrt{U_0''(c)}\,\exp[U_0(c)/kT]. \tag{9.2.29} $$

The maxima and minima of U(x) are slightly different from those of $U_0(x)$. If we
assume that higher derivatives of $U_0(x)$ are negligible, then the maxima and minima
of U(x) are at points where $U'(x) = 0$ and are given by $a + \delta a$, $b + \delta b$, $c + \delta c$,
where

$$ \delta a = -\delta p/U_0''(a) \equiv \beta_a\,\delta p $$
$$ \delta b = \delta p/|U_0''(b)| \equiv \beta_b\,\delta p \tag{9.2.30} $$
$$ \delta c = -\delta p/U_0''(c) \equiv \beta_c\,\delta p. $$

We thus identify $\beta_a$ and $\beta_c$ as the compressibilities $\partial x/\partial p$ of the states a and c, which
are negative, as required by stability. The quantity $\beta_b$ is some kind of incompressibility
of the transition state. Since this is unstable, $\beta_b$ is positive.

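The values of U at these shifted stationary points are

$$ U(a + \delta a) = U_0(a + \delta a) + (a + \delta a)\,\delta p = U_0(a) + a\,\delta p + \tfrac12\beta_a(\delta p)^2 $$
$$ U(b + \delta b) = U_0(b) + b\,\delta p + \tfrac12\beta_b(\delta p)^2 \tag{9.2.31} $$
$$ U(c + \delta c) = U_0(c) + c\,\delta p + \tfrac12\beta_c(\delta p)^2 $$

and from (9.2.23), together with (9.2.29), we obtain

$$ n_a/n_c = \exp\{[(c - a)\,\delta p + \tfrac12(\beta_c - \beta_a)(\delta p)^2]/kT\}. \tag{9.2.32} $$

This formula is exactly that obtained by thermodynamic reasoning and fits the
data well.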

The relaxation time $\tau_r$ to the stationary distribution has also been measured.
We compute it using (9.2.24, 25). We find that

$$ \tau_r(0) = 2\pi\gamma\,\beta_b^{1/2}(\beta_a\beta_c)^{1/4}\,\exp\{[2U_0(b) - U_0(a) - U_0(c)]/(2kT)\} $$

$$ \tau_r(\delta p) = (n_a n_c)^{1/2}\,\tau_r(0)\,\exp\Big\{-\frac{a + c - 2b}{2kT}\,\delta p
   + \frac{(\delta p)^2}{2kT}\big(\beta_b - \tfrac12\beta_a - \tfrac12\beta_c\big)\Big\}. \tag{9.2.33} $$

Notice that, in principle, a and c, the volumes of the two states, and $\beta_a$ and $\beta_c$, their
compressibilities, are all measurable directly. The transition state data b, U(b) and
$\beta_b$ are left as free parameters to be determined from lifetime measurements. Of
course, the quadratic terms will only be valid for sufficiently small $\delta p$, and for
applications it may be necessary to use a more sophisticated method.

In Fig. 9.2, $\tau_r(\delta p)$ and $n_a/n_c$ are plotted for a set of possible values of parameters,
as computed from (9.2.33).

Notice that the equilibration time reaches a maximum near the point at which
natural and denatured forms are in equal concentration. Some skewing, however,
can be induced by making the potential asymmetric. Hawley, in fact, notes that
measurements in this region are limited by “instrumental stability and the patience
of the investigator.”

Finally, the curves with zero compressibility are given for comparison. The
difference is so large at the wings that it is clear that the quadratic correction
method is not valid for the $\tau_r(\delta p)$ curve. However, the corrections almost cancel for
the $n_a/n_c$ curve. A realistic treatment in which more variables are included preserves
the qualitative nature of this description, but permits as well the possibility $\beta_b < 0$,
which is not possible here, as is shown in [9.9].
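
To make the pressure dependence concrete, the sketch below evaluates the population ratio and the relative relaxation time directly from (9.2.23), (9.2.25), (9.2.29) and (9.2.31), under the same assumption that the second derivatives of the potential do not change with pressure. All parameter values are hypothetical and in arbitrary units; they are not the values used for Fig. 9.2, and the function names are illustrative only.

```python
import math

def population_ratio(dp, a, c, beta_a, beta_c, kT):
    """n_a/n_c from (9.2.23) evaluated at the shifted minima of (9.2.31).

    By the condition (9.2.29) the ratio equals one at dp = 0, so only the
    pressure-dependent part of U(c) - U(a) survives in the exponent.
    """
    dU = (c - a) * dp + 0.5 * (beta_c - beta_a) * dp**2
    return math.exp(dU / kT)

def relative_relaxation_time(dp, a, b, c, beta_a, beta_b, beta_c, kT):
    """tau_r(dp) divided by its value at dp = 0, from (9.2.24, 25) with (9.2.31)."""
    r = population_ratio(dp, a, c, beta_a, beta_c, kT)
    n_a, n_c = r / (1.0 + r), 1.0 / (1.0 + r)
    # Change in [2U(b) - U(a) - U(c)] / (2 kT) caused by the pressure shift.
    shift = ((2.0 * b - a - c) * dp
             + (beta_b - 0.5 * beta_a - 0.5 * beta_c) * dp**2) / (2.0 * kT)
    return math.sqrt(n_a * n_c) / 0.5 * math.exp(shift)

# Hypothetical parameters: volumes a < b < c, negative compressibilities for the
# stable states and a positive value for the transition state.
params = dict(a=1.0, b=1.5, c=2.0, beta_a=-0.02, beta_b=0.04, beta_c=-0.02, kT=1.0)
for dp in (-0.5, 0.0, 0.5):
    ratio = population_ratio(dp, params["a"], params["c"],
                             params["beta_a"], params["beta_c"], params["kT"])
    print(dp, ratio, relative_relaxation_time(dp, **params))
```

For this symmetric choice of parameters the relative relaxation time is largest at dp = 0, in line with the remark that the equilibration time peaks near equal concentrations.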


9.2.3 Bistability with Birth-Death Master Equations (One Variable) 


The qualitative behaviour of bistable systems governed by one-step birth-death
Master equations is almost the same as that for Fokker-Planck equations.

Consider a one-step process with transition probabilities $t^+(x)$, $t^-(x)$ so that
the Master equation can be written as

$$ \frac{\partial P(x, t)}{\partial t} = J(x + 1, t) - J(x, t) \tag{9.2.34} $$

with

$$ J(x, t) = t^-(x)P(x, t) - t^+(x - 1)P(x - 1, t). \tag{9.2.35} $$


Suppose now that the stationary distribution has maxima at a, c and a minimum
at b, and in a similar way to that in Sect. 9.2.1, define

$$ M(x, t) = \sum_{z=0}^{x-1} P(z, t) \tag{9.2.36} $$
$$ N_a(t) = 1 - N_c(t) = M(b, t) \tag{9.2.37} $$

and if $x_0$ is a point near b,

$$ N_0(t) = P(x_0, t). \tag{9.2.38} $$

The corresponding stationary quantities are

$$ n_a = 1 - n_c = \sum_{z=0}^{b-1} P_s(z) \tag{9.2.39} $$
$$ n_0 = P_s(x_0). \tag{9.2.40} $$
We now sum (9.2.34) from 0 to x - 1 to obtain

$$ \frac{\partial M(x, t)}{\partial t} = J(x, t) \tag{9.2.41} $$

[since J(0, t) = 0]. We now use the fact that the stationary solution $P_s(x)$ is, in a
one-step process, obtained from the detailed balance equation

$$ t^-(x)P_s(x) = t^+(x - 1)P_s(x - 1) \tag{9.2.42} $$

to introduce an “integrating factor” for (9.2.41). Namely, define

$$ \beta(x, t) = P(x, t)/P_s(x); \tag{9.2.43} $$

then (9.2.41) can be written

$$ \frac{\partial M(x, t)}{\partial t} = P_s(x)\,t^-(x)\,[\beta(x, t) - \beta(x - 1, t)] \tag{9.2.44} $$

so that

$$ \frac{d}{dt}\sum_{z=a+1}^{x_0}\frac{M(z, t)}{P_s(z)\,t^-(z)} = \beta(x_0, t) - \beta(a, t)
   = \frac{P(x_0, t)}{P_s(x_0)} - \frac{P(a, t)}{P_s(a)}. \tag{9.2.45} $$

Equation (9.2.45) is now in almost precisely the form of (9.2.6) for the corresponding
Fokker-Planck process. It depends on the solution being obtained via a
detailed balance method (9.2.42).
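
The detailed balance construction can be checked numerically. The sketch below builds $P_s(x)$ from (9.2.42) for a hypothetical pair of bistable rates and verifies that the current (9.2.35) takes the form used in (9.2.44) for an arbitrary distribution. The cubic rate functions, the size parameter V and the truncation n_max are assumptions made only for illustration.

```python
import numpy as np

V = 100.0  # hypothetical system-size parameter

def t_plus(x):
    """Birth rate t^+(x) for the step x -> x + 1 (illustrative choice)."""
    u = x / V
    return V * (7.0 * u**2 + 8.0)

def t_minus(x):
    """Death rate t^-(x) for the step x -> x - 1; it vanishes at x = 0."""
    u = x / V
    return V * (u**3 + 14.0 * u)

def stationary_distribution(n_max):
    """Solve t^-(x) P_s(x) = t^+(x-1) P_s(x-1) recursively, in logs for stability."""
    x = np.arange(1, n_max + 1)
    log_ratio = np.log(t_plus(x - 1)) - np.log(t_minus(x))
    log_p = np.concatenate(([0.0], np.cumsum(log_ratio)))
    p = np.exp(log_p - log_p.max())
    return p / p.sum()

def current(P, x):
    """J(x, t) of (9.2.35) for x >= 1."""
    return t_minus(x) * P[x] - t_plus(x - 1) * P[x - 1]

n_max = 600
P_s = stationary_distribution(n_max)
x = np.arange(1, n_max + 1)

# Any normalised distribution will do for checking the identity behind (9.2.44).
rng = np.random.default_rng(0)
P = rng.random(n_max + 1)
P /= P.sum()
beta = P / P_s
lhs = current(P, x)
rhs = P_s[x] * t_minus(x) * (beta[x] - beta[x - 1])

# Interior local maxima of P_s play the role of the peaks a and c.
peaks = [int(i) for i in range(1, n_max)
         if P_s[i] > P_s[i - 1] and P_s[i] > P_s[i + 1]]
print("peaks of P_s at x =", peaks)
print("largest deviation between the two forms of J:", float(np.max(np.abs(lhs - rhs))))
```

Working with logarithms avoids overflow when the two peaks differ greatly in height; the identity itself only uses (9.2.42) and the definition (9.2.43).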

We make the same assumptions as Kramers; namely, we assume that only relaxation
between peaks is now relevant and write

$$ P(x, t) = \begin{cases} P_s(x)N_a(t)/n_a, & x < b \\ P_s(x)N_c(t)/n_c, & x > b \end{cases} \tag{9.2.46} $$

and obtain relaxation equations exactly like those in (9.2.8):

$$ \kappa(x_0)\dot N_a(t) = N_0(t)/n_0 - N_a(t)/n_a $$
$$ \mu(x_0)\dot N_c(t) = N_0(t)/n_0 - N_c(t)/n_c \tag{9.2.47} $$
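where

$$ \kappa(x_0) = \sum_{z=a+1}^{x_0} [P_s(z)\,t^-(z)]^{-1}[1 - \psi(z)] $$
$$ \mu(x_0) = \sum_{z=x_0+1}^{c} [P_s(z)\,t^-(z)]^{-1}[1 - \psi(z)] \tag{9.2.48} $$

with

$$ \psi(z) = \begin{cases} n_a^{-1}\sum_{y=z}^{b-1} P_s(y), & z < b \\ n_c^{-1}\sum_{y=b}^{z-1} P_s(y), & z > b. \end{cases} \tag{9.2.49} $$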

The only significant difference is that D appears on the right of (9.2.8) but is here
replaced by a factor $t^-(z)^{-1}$ in the definitions of $\kappa(x_0)$ and $\mu(x_0)$.

All the same approximations can be made, the only difficulty being a precise
reformulation of the $D \to 0$ limit, which must here correspond to a large number
limit, in which all functions change smoothly as x changes by ±1. This is just the
limit of the system size expansion, in which a Fokker-Planck description can be
used anyway.

We shall not go into details, but merely mention that exact mean exit times
are obtainable by the method of Sect. 7.4. By adapting the methods of Sect. 5.2.8
to this system one finds that the splitting probabilities that the system, initially at $x_0$,
reaches the points a, c are

$$ \pi_a = \Big\{\sum_{z=x_0+1}^{c} [P_s(z)\,t^-(z)]^{-1}\Big\}\Big\{\sum_{z=a+1}^{c} [P_s(z)\,t^-(z)]^{-1}\Big\}^{-1} $$
$$ \pi_c = \Big\{\sum_{z=a+1}^{x_0} [P_s(z)\,t^-(z)]^{-1}\Big\}\Big\{\sum_{z=a+1}^{c} [P_s(z)\,t^-(z)]^{-1}\Big\}^{-1}. \tag{9.2.50} $$

Thus, for all practical considerations one might just as well model by means of a
Fokker-Planck description. It is rare that one knows exactly what the underlying
mechanisms are, so that any equation written down can be no more than an
educated guess, for which purpose the simplest is the most appropriate.


9.3 Bistability in Multivariable Systems 


There is a wide variety of possibilities when one deals with multivariable systems.
If the system is described by a Master equation, the possible variety of kinds of
transition and state space is so bewilderingly rich that one can hardly imagine
where to start. However, since we saw in Sect. 9.2.3 that a Master equation description
is not very different from a Fokker-Planck description, it seems reasonable to
restrict oneself to the latter, which, it turns out, are already quite sufficiently
complicated.

The heuristic treatments of these problems, as developed mainly by the physicists
Langer, Landauer and Swanson [9.6], are now in the process of being made
rigorous by mathematical treatments by Schuss and Matkowsky [9.7] and others.
The first rigorous treatment was by Ventsel and Freidlin [9.8] which, however,
does not seem to have attracted much attention from applied workers, since the rigour
is used only to confirm estimates that have long been guessed, rather than to
give precise asymptotic expansions, as do the more recent treatments.

We will consider here systems described in a space of l dimensions by a Fokker-Planck
equation which is conveniently written in the form

$$ \partial_t p = \nabla\cdot[-v(x)p + \varepsilon D(x)\cdot\nabla p] \tag{9.3.1} $$

whose stationary solution is called $p_s(x)$ and which is assumed to be known in much
of what follows. It can, of course, be estimated asymptotically in the small $\varepsilon$ limit
by the method of Sect. 6.3.3.


9.3.1 Distribution of Exit Points

We will treat here only a simplified case of (9.3.1) in which D(x) is the identity:

$$ D(x) = 1. \tag{9.3.2} $$

This does conceal features which can arise from strongly varying D but, mostly, the
results are not greatly changed.

We suppose that the system is confined to a region R with boundary S, and that
the velocity field v(x) points inwards to a stationary point a. The problem is the
asymptotic estimate of the distribution of points b on S at which the point escapes
from R. We use (5.4.49) for $\pi(b, x)$ (the distribution of escape points on S, starting
from the point x), which in this case takes the form

$$ v(x)\cdot\nabla\pi(b, x) + \varepsilon\nabla^2\pi(b, x) = 0 \tag{9.3.3} $$

with boundary condition

$$ \pi(b, u) = \delta_s(b - u) \quad (u \in S). \tag{9.3.4} $$


An asymptotic solution, valid for $\varepsilon \to 0$, is constructed, following the method of
Matkowsky and Schuss [9.7].


a) Solution Near x = a and in the Interior of R
Firstly one constructs a solution valid inside R. For $\varepsilon = 0$ we have

$$ v(x)\cdot\nabla\pi(b, x) = 0 \tag{9.3.5} $$

which implies that $\pi(b, x)$ is constant along the flow lines of v(x), since it simply
states that the derivative of $\pi(b, x)$ along these lines is zero. Since we assume all the
flow lines pass through a, we have

$$ \pi(b, x) = \pi(b, a) \tag{9.3.6} $$

for any x inside. However, the argument is flawed by the fact that v(a) = 0 and
hence (9.3.5) is no longer an appropriate approximation.

We consider, therefore, the solution of (9.3.3) within a distance of order $\sqrt{\varepsilon}$
of the origin. To assist in this, we introduce new coordinates $(z, y_r)$ which are
chosen so that z measures the distance from a, while the $y_r$ are a set of l - 1 tangential
variables measuring the orientation around a.

More precisely, choose z(x) and $y_r(x)$ so that

$$ v(x)\cdot\nabla z(x) = -z(x) $$
$$ v(x)\cdot\nabla y_r(x) = 0 \tag{9.3.7} $$
$$ z(a) = 0. $$

The negative sign in the first of these equations takes account of the fact that a is
assumed stable, so that v(x) points towards a. Thus, z(x) increases as x travels
further from a.

Thus, we find, for any function f,

$$ \nabla f = \nabla z\,\frac{\partial f}{\partial z} + \sum_r \nabla y_r\,\frac{\partial f}{\partial y_r} \tag{9.3.8} $$

and hence

$$ v(x)\cdot\nabla\pi = -z\,\frac{\partial\pi}{\partial z} \tag{9.3.9} $$

and

$$ \nabla^2\pi = \nabla z\cdot\nabla z\,\frac{\partial^2\pi}{\partial z^2}
   + 2\sum_r \nabla z\cdot\nabla y_r\,\frac{\partial^2\pi}{\partial z\,\partial y_r}
   + \sum_{r,s}\nabla y_r\cdot\nabla y_s\,\frac{\partial^2\pi}{\partial y_r\,\partial y_s}
   + \nabla^2 z\,\frac{\partial\pi}{\partial z}
   + \sum_r \nabla^2 y_r\,\frac{\partial\pi}{\partial y_r}. \tag{9.3.10} $$

We now evaluate $\pi$ asymptotically by changing to the scaled (or stretched) variable
$\zeta$ defined by

$$ z = \zeta\sqrt{\varepsilon}. \tag{9.3.11} $$

Substituting (9.3.8-11) into (9.3.3) we find that, to lowest order in $\varepsilon$,

$$ -\zeta\,\frac{\partial\pi}{\partial\zeta} + H\,\frac{\partial^2\pi}{\partial\zeta^2} = 0 \tag{9.3.12} $$

where

$$ H = \nabla z(a)\cdot\nabla z(a). $$

We can now solve this equation, getting

$$ \pi(b, x) = C_1\int_0^{z/\sqrt{\varepsilon}} d\zeta\,\exp(\zeta^2/2H) + \pi(b, a). \tag{9.3.13} $$

Because H is positive, we can only match this solution for $\pi$ with the constancy of
$\pi(b, x)$ along flow lines for x ≠ a if $C_1 = 0$. Hence, for all x in the interior of R,

$$ \pi(b, x) = \pi(b, a). \tag{9.3.14} $$

[Notice that if v(x) points the other way, i.e., a is unstable, we omit the negative sign
in (9.3.9) and find that $\pi(b, x)$ is given by (9.3.13) with $H \to -H$, and hence within a
distance of order $\sqrt{\varepsilon}$ of a, $\pi(b, x)$ changes its value.]


b) Solution Near the Boundary S
We consider, with an eye to later applications, the solution of the slightly more
general equation

$$ v(x)\cdot\nabla f(x) + \varepsilon\nabla^2 f(x) = 0 $$

with

$$ f(u) = g(u) \quad (u \in S) \tag{9.3.15} $$

of which the boundary value problem (9.3.4) is a particular case. Two situations
arise.


i) $\nu\cdot v(x) \ne 0$ on S or anywhere in R except x = a, which is stable: clearly, in any
asymptotic method, the boundary condition (9.3.15) is not compatible with a
constant solution. Hence, there must be rapid changes at the boundary.

Near a point u on S we can write

$$ v(u)\cdot\nabla f(x) + \varepsilon\nabla^2 f(x) = 0. \tag{9.3.16} $$

Near S it is most convenient to introduce $\nu(u)$, the normal (pointing out) at u to S,
and to define a variable $\rho$ by

$$ x = u - \varepsilon\rho\,\nu(u) \tag{9.3.17} $$

and other variables $y_r$ parallel to S.
Then to lowest order in $\varepsilon$, (9.3.16) reduces to (at a point near u on S)

$$ [\nu\cdot v(u)]\,\frac{\partial f}{\partial\rho} + H(u)\,\frac{\partial^2 f}{\partial\rho^2} = 0 \tag{9.3.18} $$

with $H(u) = \nu^2 = 1$.
The solution is then

$$ f(x) = g(u) + C_1(u)\{1 - \exp[-\nu\cdot v(u)\,\rho]\}. \tag{9.3.19} $$

As $\rho \to \infty$, we approach a finite distance into the interior of R and thus

$$ f(x) \to g(u) + C_1(u) = C_0 \tag{9.3.20} $$

from the analysis in (a), so

$$ C_1(u) = C_0 - g(u). \tag{9.3.22} $$

One must now fix $C_0$, which is the principal quantity actually sought.

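This can be done by means of Green’s theorem. For let $p_s(x)$ be the usual
stationary solution of the forward Fokker-Planck equation. We know it can be
written

$$ p_s(x) = \exp\Big\{-\frac{1}{\varepsilon}[\phi(x) + O(\varepsilon)]\Big\} \tag{9.3.23} $$

as has been shown in Sect. 6.3.3. We take (9.3.16), multiply by $p_s(x)$ and integrate
over R. Using the fact that $p_s(x)$ satisfies the forward Fokker-Planck equation, this
can be reduced to a surface integral

$$ 0 = \int_R dx\,p_s(x)[v(x)\cdot\nabla f(x) + \varepsilon\nabla^2 f(x)] \tag{9.3.24} $$

$$ \phantom{0} = \int_S dS\,\{p_s(x)\,\nu\cdot v(x)\,f(x) + \varepsilon[p_s(x)\,\nu\cdot\nabla f(x) - f(x)\,\nu\cdot\nabla p_s(x)]\}. \tag{9.3.25} $$

Noting that, to lowest order in $\varepsilon$,

$$ \nu\cdot\nabla f(x) = -\varepsilon^{-1}\frac{\partial f}{\partial\rho} = -\varepsilon^{-1}\,\nu\cdot v(x)\,[C_0 - g(x)] \tag{9.3.26} $$

and

$$ \nu\cdot\nabla p_s(x) = -\varepsilon^{-1}\,\nu\cdot\nabla\phi(x)\,\exp[-\phi(x)/\varepsilon], \tag{9.3.27} $$

we deduce that

$$ C_0 = \frac{\int_S dS\,e^{-\phi(x)/\varepsilon}\,[2\nu\cdot v(x) + \nu\cdot\nabla\phi(x)]\,g(x)}
              {\int_S dS\,e^{-\phi(x)/\varepsilon}\,\nu\cdot v(x)}. \tag{9.3.28} $$

Recalling that for this problem, $g(u) = \delta_s(u - b)$, we find that, if x is well in the
interior of R,

$$ \pi(x, b) = C_0 = \frac{e^{-\phi(b)/\varepsilon}\,[2\nu\cdot v(b) + \nu\cdot\nabla\phi(b)]}
                          {\int_S dS\,e^{-\phi(x)/\varepsilon}\,\nu\cdot v(x)}. \tag{9.3.29} $$

We see here that the exit distribution is essentially $\exp[-\phi(b)/\varepsilon]$, i.e., approximately
the stationary distribution. If the Fokker-Planck equation has a potential solution,
then

$$ v(b) = -\nabla\phi(b) \tag{9.3.30} $$

and

$$ \pi(x, b) = e^{-\phi(b)/\varepsilon}\,\nu\cdot v(b)\Big/\Big[\int_S dS\,e^{-\phi(x)/\varepsilon}\,\nu\cdot v(x)\Big] $$

and we simply have a kind of average flow result.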


ii) $\nu\cdot v(x) = 0$ on S: this problem is more directly related to bistability, since
midway between two stable points a and c, a curve $\nu\cdot v(x) = 0$ which separates the
two regions and is known as a separatrix is expected.

The method is much the same except that near u on S, we expect

$$ \nu\cdot v(x) \sim \nu\cdot(x - u)\,\kappa(u), \tag{9.3.31} $$

where $\kappa(u)$ is a coefficient which depends on v(x) and is assumed to be nonzero.
The situation is now like that at x = a and it is appropriate to substitute

$$ x = u - \sqrt{\varepsilon}\,\rho\,\nu(u) \tag{9.3.32} $$

and to lowest order in $\varepsilon$, (9.3.16) reduces to (at a point near u on S)

$$ \kappa(u)\,\rho\,\frac{\partial f}{\partial\rho} + \frac{\partial^2 f}{\partial\rho^2} = 0 \tag{9.3.33} $$

so that

$$ f(x) = g(u) + C_1\int_0^{\rho} d\rho\,\exp[-\tfrac12\kappa(u)\rho^2] \tag{9.3.34} $$

and letting $\rho \to \infty$, we find

$$ C_1 = [C_0 - g(u)]\sqrt{\frac{2\kappa(u)}{\pi}}. \tag{9.3.35} $$

The result for $\pi(x, b)$ is now completed as before: one gets

$$ \pi(x, b) = \frac{e^{-\phi(b)/\varepsilon}\Big[\sqrt{2\varepsilon\kappa(b)/\pi} + \nu\cdot\nabla\phi(b)\Big]}
                   {\int_S dS\,e^{-\phi(x)/\varepsilon}\sqrt{2\varepsilon\kappa(x)/\pi}}. \tag{9.3.36} $$


9.3.2 Asymptotic Analysis of Mean Exit Time 


From our experience in one-dimensional systems, we expect the mean exit time
from a point within R to be of order $\exp(K/\varepsilon)$, for some K > 0, as $\varepsilon \to 0$. We
therefore define

$$ \tau(x) = \exp(-K/\varepsilon)\,T(x) \tag{9.3.37} $$

where T(x) is the mean escape time from R starting at x, and $\tau(x)$ satisfies (from
Sect. 5.4)

$$ v(x)\cdot\nabla\tau(x) + \varepsilon\nabla^2\tau(x) = -e^{-K/\varepsilon} $$
$$ \tau(u) = 0 \quad (u \in S). \tag{9.3.38} $$

If this scaling is correct, then any expansion of $\tau(x)$ in powers of $\varepsilon$ will not see the
exponential, so the equation to lowest order in $\varepsilon$ will be essentially (9.3.16).

As in that case, we show that $\tau(x)$ is essentially constant in the interior of R and
can be written as [in the case $\nu\cdot v(x) \ne 0$ on S]

$$ \tau(x) \sim C_0\{1 - \exp[-\nu\cdot v(x)\,\rho]\} \tag{9.3.39} $$

near S.
We multiply (9.3.38) by $p_s(x) = \exp[-\phi(x)/\varepsilon]$ and use Green’s theorem to
obtain [in much the same way as (9.3.25) but with $\tau(x) = 0$ on S]

$$ -e^{-K/\varepsilon}\int_R dx\,e^{-\phi(x)/\varepsilon} = -\int_S dS\,e^{-\phi(x)/\varepsilon}\,[C_0\,\nu\cdot v(x)], \tag{9.3.40} $$

i.e.,

$$ C_0 = \int_R dx\,e^{-[K + \phi(x)]/\varepsilon}\Big/\int_S dS\,e^{-\phi(x)/\varepsilon}\,\nu\cdot v(x). \tag{9.3.41} $$

By hypothesis, $C_0$ does not change exponentially like $\exp(A/\varepsilon)$. In the numerator
of (9.3.41) the main contribution comes from the minimum of $\phi(x)$, which occurs at
the point a, whereas in the denominator, it comes from the point on S where $\phi(x)$ is a
minimum, which we shall call $x_0$. Thus, the ratio behaves like

$$ \exp\{[\phi(x_0) - \phi(a) - K]/\varepsilon\} $$

and hence for $C_0$ to be asymptotically constant,

$$ K = \phi(x_0) - \phi(a) \tag{9.3.42} $$

which is positive, as required, and, for x well into the interior of R, we have

$$ T(x) \sim \int_R dx\,e^{-\phi(x)/\varepsilon}\Big/\int_S dS\,e^{-\phi(x)/\varepsilon}\,\nu\cdot v(x). \tag{9.3.43} $$

In the case where $\nu\cdot v(x) = 0$ on all of S, we now have

$$ \tau(x) \sim C_0\int_0^{\rho} d\rho\,\exp[-\tfrac12\kappa(u)\rho^2] \tag{9.3.44} $$

and hence in the interior,

$$ \tau(x) \sim C_0\sqrt{\frac{\pi}{2\kappa(u)}}. \tag{9.3.45} $$

The analysis proceeds similarly and we find, for x well in the interior of R,

$$ T(x) \sim \sqrt{\frac{\pi}{2\varepsilon\kappa(u)}}\int_R dx\,e^{-\phi(x)/\varepsilon}\Big/\int_S dS\,e^{-\phi(x)/\varepsilon}. \tag{9.3.46} $$


9.3.3. Kramers’ Method in Several Dimensions 


The generalisation of Kramers’ method is relatively straightforward. We consider
a completely general Fokker-Planck equation in l dimensions [we use P(x) for the
probability density for notational ease]

$$ \partial_t P = \nabla\cdot[-v(x)P + \varepsilon D(x)\cdot\nabla P] \tag{9.3.47} $$

whose stationary solution is to be called $P_s(x)$ and can only be exhibited explicitly
if (9.3.47) satisfies potential conditions. We assume that $P_s(x)$ has two well-defined
maxima at a and c and a well-defined saddle point at b (Fig. 9.3). We assume that the
value at the saddle point is very much smaller than the values at a and c. We introduce
a family of (l - 1)-dimensional planes S(w), where w is a parameter which
labels the planes. We choose S(a) to pass through a, S(b) through b and S(c)
through c. The planes S(w) are assumed to be oriented in such a way that $P_s(x)$
has a unique maximum when restricted to any one of them. We define, similarly
to Sect. 9.2.1,

$$ M[S(w)] = \int_{L(w)} dx\,P(x), \tag{9.3.48} $$

where L(w) is the region of space to the left of the plane S(w); then

$$ \dot M[S(w)] = \int_{S(w)} dS\cdot[-v(x)P + \varepsilon D(x)\cdot\nabla P]. \tag{9.3.49} $$

Fig. 9.3. Contours of the stationary distribution function $P_s(x)$. The plane S(w) is oriented
so that $P_s(x)$ has a unique maximum there, and the curve x = u(w) (dashed line) is the
locus of these maxima

The current in the stationary state is defined by

$$ J_s = -v(x)P_s + \varepsilon D(x)\cdot\nabla P_s. \tag{9.3.50} $$

Assumption I: we exclude cases in which finite currents $J_s$ occur where $P_s$ is very
small. Because $\nabla\cdot J_s = 0$, we can write

$$ J_s = -\varepsilon\nabla\cdot(AP_s) \tag{9.3.51} $$

where A is an antisymmetric tensor. We require that A be of the same order of
magnitude as D(x), or smaller.

Relaxation equations are derived in two stages. Define a quantity $\beta(x)$ by

$$ \beta(x) = P(x, t)/P_s(x) = \begin{cases} N_a(t)/n_a & (x \text{ near } a) \\ N_c(t)/n_c & (x \text{ near } c). \end{cases} \tag{9.3.52} $$

This is the assumption that all relaxation within peaks has ceased. Substitute now
in (9.3.49), integrate by parts discarding terms at infinity and obtain

$$ \dot M[S(w)] = \varepsilon\int_{S(w)} dS\cdot[\mathcal{D}(x)\cdot\nabla\beta]\,P_s(x) \tag{9.3.53} $$

with

$$ \mathcal{D}(x) = D(x) + A(x). \tag{9.3.54} $$

Assumption II: $P_s(x)$ is sharply singly peaked on S(w) so we may make the approximate
evaluation

$$ \dot M[S(w)] = \{\varepsilon[n(w)\cdot\mathcal{D}(x)\cdot\nabla\beta]_{u(w)} + \delta(w)\}\,\Big|\int_{S(w)} dS\,P_s(x)\Big| \tag{9.3.55} $$

where $\delta(w)$ is expected to be very much smaller than the term in square brackets.
Here u(w) is the position at which $P_s(x)$ has its maximum value when restricted to
S(w), and n(w) is the normal to S(w).


Assumption III: the direction of n(w) can be chosen so that $\mathcal{D}^T(x)\cdot n(w)$ is parallel
to the tangent at w to the curve x = u(w), without violating the other assumptions.
Hence,

$$ \mathcal{D}^T[u(w)]\cdot n(w) = d(w)\,\partial_w u(w). \tag{9.3.56} $$


Defining now

$$ p(w) = \Big|\int_{S(w)} dS\,P_s(x)\Big|, \tag{9.3.57} $$

which is (up to a slowly varying factor) the probability density for the particle to
be on the plane S(w), we expect p(w) to have a two-peaked shape with maxima at
w = a and w = c and a minimum at w = b.


Assumption IV: these are assumed to be sharp maxima and minima. Neglecting
$\delta(w)$, making the choice (9.3.56) and noting

$$ \partial_w u(w)\cdot\nabla\beta[u(w)] = \partial_w\beta[u(w)], \tag{9.3.58} $$

we find

$$ \int_a^{w_0} dw\,\frac{\dot M[S(w)]}{\varepsilon\,p(w)\,d(w)} = \beta(w_0) - \beta(a). \tag{9.3.59} $$


Using the sharply peaked nature of $p(w)^{-1}$, (9.3.59) can now be approximated by
taking the value at the peak, using (9.3.52) and

$$ N_a(t) = M[S(b), t] \tag{9.3.60} $$

as well as defining

$$ \kappa(w_0) = \int_a^{w_0} [p(w)]^{-1}\,dw \tag{9.3.61} $$
$$ \mu(w_0) = \int_{w_0}^{c} [p(w)]^{-1}\,dw, \tag{9.3.62} $$

to obtain the relaxation equations

$$ \kappa(w_0)\dot N_a(t) = \varepsilon d(w_0)[N_0(t)/n_0 - N_a(t)/n_a] \tag{9.3.63} $$
$$ \mu(w_0)\dot N_c(t) = \varepsilon d(w_0)[N_0(t)/n_0 - N_c(t)/n_c]. \tag{9.3.64} $$

These are of exactly the same form as those in the one-variable case and all the
same interpretations can be made.


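9.3.4 Example: Brownian Motion in a Double Potential


We consider Brownian motion in velocity and position as outlined in Sect. 5.3.6.
Thus, we consider the Fokker-Planck equation

$$ \frac{\partial P(x, p, t)}{\partial t} = -p\,\frac{\partial P}{\partial x} + U'(x)\,\frac{\partial P}{\partial p}
   + \gamma\Big[\frac{\partial}{\partial p}(pP) + \frac{\partial^2 P}{\partial p^2}\Big]. \tag{9.3.65} $$

In the notation of the previous section we have

$$ x = (x, p) $$
$$ v(x) = (p, -U'(x) - \gamma p) $$
$$ \varepsilon = 1 $$
$$ D(x) = \begin{pmatrix} 0 & 0 \\ 0 & \gamma \end{pmatrix} \tag{9.3.66} $$
$$ P_s(x) = \mathcal{N}_2\exp[-\tfrac12 p^2 - U(x)] $$
$$ \mathcal{N}_2 = (2\pi)^{-1/2}\mathcal{N}_1 $$
$$ \mathcal{N}_1 = \Big\{\int dx\,\exp[-U(x)]\Big\}^{-1}. $$

Hence, we can write

$$ v(x) = \begin{pmatrix} 0 & -1 \\ 1 & \gamma \end{pmatrix}\cdot\nabla(\log P_s) \tag{9.3.67} $$

and the current in the stationary state is

$$ J_s = -vP_s + D\cdot\nabla P_s = -\nabla\cdot\Big[\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}P_s\Big] \tag{9.3.68} $$

so that A exists, and

$$ A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \tag{9.3.69} $$

Thus, Assumption I is satisfied.
The plane S(w) can be written in the form

$$ \lambda x + p = w. \tag{9.3.70} $$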


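Assumption II requires us to maximise $P_s(x)$ on this plane, i.e., to maximise
$-\tfrac12 p^2 - U(x)$ on this plane. Using standard methods, we find that maxima must lie
along the curve u(w) given by

$$ u(w) = \begin{pmatrix} x(w) \\ p(w) \end{pmatrix} = \begin{pmatrix} x(w) \\ w - \lambda x(w) \end{pmatrix} \tag{9.3.71} $$

where x(w) satisfies

$$ U'[x(w)] + \lambda^2 x(w) - \lambda w = 0; \tag{9.3.72} $$

whether $P_s(x)$ is sharply peaked depends on the nature of U(x).

We now implement Assumption III.

The parameter $\lambda$ is a function of w on the particular set of planes which satisfy
(9.3.56). The tangent to u(w) is parallel to

$$ \Big(\frac{dx}{dw},\ 1 - \lambda\frac{dx}{dw} - x\frac{d\lambda}{dw}\Big) \tag{9.3.73} $$

and differentiating (9.3.72) we have

$$ \frac{dx}{dw} = (U'' + \lambda^2)^{-1}\Big[\lambda - \frac{d\lambda}{dw}(2\lambda x - w)\Big]. \tag{9.3.74} $$

The normal to (9.3.70) is parallel to $(\lambda, 1)$. Hence,

$$ \mathcal{D}^T\cdot n = (1 + \lambda^2)^{-1/2}\begin{pmatrix} 0 & 1 \\ -1 & \gamma \end{pmatrix}\begin{pmatrix} \lambda \\ 1 \end{pmatrix}
   = (1 + \lambda^2)^{-1/2}\begin{pmatrix} 1 \\ \gamma - \lambda \end{pmatrix} \tag{9.3.75} $$

and this is parallel to (9.3.73) if

$$ \frac{dx}{dw} = \Big[1 - \lambda\frac{dx}{dw} - x\frac{d\lambda}{dw}\Big](\gamma - \lambda)^{-1}. \tag{9.3.76} $$

We can now solve (9.3.74, 76) simultaneously, to get

$$ \frac{dx}{dw} = \frac{1}{\gamma}\,\frac{w - \lambda x}{\gamma^{-1}x(U'' + \lambda^2) - (2\lambda x - w)} \tag{9.3.77} $$

$$ \frac{d\lambda}{dw} = \frac{\gamma^{-1}(U'' + \lambda^2) - \lambda}{\gamma^{-1}x(U'' + \lambda^2) - (2\lambda x - w)}. \tag{9.3.78} $$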


The saddle point is at (x, p) = (0, 0) and thus w = 0 implies x = 0. Using this in
(9.3.77) we see that we must have

$$ x \approx w/\gamma \quad \text{as } w \to 0. \tag{9.3.79} $$

Near x = 0, we write approximately

$$ U[x] \approx -\tfrac12 U_2 x^2 \tag{9.3.80} $$

and substituting (9.3.79, 80) in (9.3.72), we see that

$$ \lambda^2 - \gamma\lambda + U''(0) = 0 \tag{9.3.81} $$

which determines

$$ \lambda = \frac{\gamma}{2} \pm \sqrt{\frac{\gamma^2}{4} + U_2}. \tag{9.3.82} $$

We now see that (9.3.78) tells us that $d\lambda/dw = 0$ when w = 0. Thus, $\lambda$ will not
change significantly from (9.3.82) around the saddle point, and we shall from now
on approximate $\lambda$ by (9.3.82).

Only one of the roots is acceptable and physically, this should be $\lambda \to \infty$ in the
high friction limit, which would give Kramers’ result and requires the positive
sign. The other root corresponds to taking a plane such that we get a minimum of
$P_s(x)$ on it.
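
As a small numerical check of (9.3.81, 82), the following sketch computes both roots and shows how the positive root approaches $\gamma$ at high friction. The function name and the sample values of $\gamma$ and $U_2$ are illustrative assumptions.

```python
import math

def separatrix_slopes(gamma, U2):
    """Both roots of lambda**2 - gamma*lambda - U2 = 0, i.e. (9.3.81) with U''(0) = -U2."""
    root = math.sqrt(0.25 * gamma**2 + U2)
    return 0.5 * gamma + root, 0.5 * gamma - root

for gamma in (0.5, 2.0, 20.0):
    lam_plus, lam_minus = separatrix_slopes(gamma, U2=1.0)
    print(f"gamma = {gamma}: lambda+ = {lam_plus:.4f}, lambda- = {lam_minus:.4f}, "
          f"lambda+/gamma = {lam_plus / gamma:.4f}")
```

The positive root is the one used in what follows; its ratio to $\gamma$ tends to one as the friction grows.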

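We now integrate (9.3.57) and determine d(w). Notice that d(w) must be defined
with n(w) a unit vector. Direct substitution in (9.3.75) and using (9.3.79) gives

$$ \mathcal{D}^T\cdot n\,(w = 0) = d(0)\,\partial_w u(0) \tag{9.3.83} $$

so that

$$ d(0) = \gamma(1 + \lambda^2)^{-1/2}. \tag{9.3.84} $$

Further,

$$ p(w) = \Big|\int_{S(w)} dS\,P_s(x)\Big| = \int\sqrt{dx^2 + dp^2}\;P_s(x, p)
   = \mathcal{N}_2\,\frac{\sqrt{1 + \lambda^2}}{\lambda}\int_{-\infty}^{\infty} dp\,\exp\Big[-\frac{p^2}{2} - U\Big(\frac{w - p}{\lambda}\Big)\Big]. \tag{9.3.85} $$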
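An exact evaluation depends on the choice of U(x). Approximately, we use

$$ U(x) \approx U_0 - \tfrac12 U_2 x^2 \tag{9.3.86} $$

and evaluate the result as a Gaussian: using $\lambda^2 - U_2 = \gamma\lambda$ from (9.3.81), we get

$$ p(w) = \mathcal{N}_1\sqrt{\frac{1 + \lambda^2}{\gamma\lambda}}\;e^{-U_0}\exp\Big[\frac{U_2 w^2}{2\gamma\lambda}\Big] \tag{9.3.87} $$

and thus

$$ \kappa(0) = \int_{-\infty}^{0} p(w)^{-1}\,dw = \tfrac12\,\mathcal{N}_1^{-1}e^{U_0}\,\frac{\gamma\lambda}{(1 + \lambda^2)^{1/2}}\sqrt{\frac{2\pi}{U_2}}. \tag{9.3.88} $$

Thus, from (9.2.19) adapted to the many dimensional theory, we have for the
mean first passage time from one well to the point x = 0,

$$ \tau_0 = \kappa(0)\,d(0)^{-1}, \tag{9.3.89} $$

i.e.,

$$ \tau_0 = \tfrac12\,\lambda\,\mathcal{N}_1^{-1}e^{U_0}\sqrt{\frac{2\pi}{U_2}}. \tag{9.3.90} $$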

Comparisons with Other Results


a) Exact One-Dimensional Mean First Passage Time (Smoluchowski’s Equation)
One reduces Kramers’ equation in the large friction limit to the Smoluchowski
equation for

$$ P(x, t) = \int_{-\infty}^{\infty} dv\,P(x, v, t), \tag{9.3.91} $$

i.e.,

$$ \frac{\partial P(x, t)}{\partial t} = \frac{1}{\gamma}\,\frac{\partial}{\partial x}\Big[U'(x)P + \frac{\partial P}{\partial x}\Big] \tag{9.3.92} $$

and the exact result for the mean first passage time from x = a to x = 0 for this
approximate equation is

$$ \tau_1 = \gamma\int_a^0 dx\,\exp[U(x)]\int_{-\infty}^{x} dz\,\exp[-U(z)]. \tag{9.3.93} $$

This result can be evaluated numerically.


b) Kramers’ Result

This is obtained by applying our method to the one-dimensional Smoluchowski
equation (9.3.92) and making Gaussian approximations to all integrals. The
result is

$$ \tau_2 = \tfrac12\,\gamma\,\mathcal{N}_1^{-1}e^{U_0}\sqrt{\frac{2\pi}{U_2}} \tag{9.3.94} $$

which differs from (9.3.90) for $\tau_0$ by the replacement $\lambda \to \gamma$, which is clearly valid in
a large $\gamma$ limit. In this limit,


t= + Oy De: (9.3.95) 


c) Corrected Smoluchowski
A more accurate equation than the Smoluchowski equation (9.3.92) is the corrected
Smoluchowski equation (6.4.108):

$$ \frac{\partial P}{\partial t} = \frac{1}{\gamma}\,\frac{\partial}{\partial x}\Big\{[1 + \gamma^{-2}U''(x)]\Big[U'(x)P + \frac{\partial P}{\partial x}\Big]\Big\}. \tag{9.3.96} $$

One now calculates the exact mean first passage time for this equation using
standard theory; it is

$$ \tau_3 = \gamma\int_a^0 dx\,[1 + \gamma^{-2}U''(x)]^{-1}\exp[U(x)]\int_{-\infty}^{x} dz\,\exp[-U(z)]. \tag{9.3.97} $$

Note, however, that the principal contribution to the x integral comes from near
x = 0, so that the small correction term $\gamma^{-2}U''(x)$ should be sufficiently accurately
evaluated by setting

$$ U''(x) = U''(0) = -U_2 \tag{9.3.98} $$

in (9.3.97). We then find the corrected Smoluchowski result,

$$ \tau_3 = (1 - \gamma^{-2}U_2)^{-1}\,\tau_2. \tag{9.3.99} $$


Notice that in this limit, 
be eae) (9.3.100) 


which means that in the limit that all integrals may be evaluated as sharply peaked 
Gaussians, our result is in agreement with the corrected Smoluchowski. 


d) Simulations
By computer simulation of the equivalent stochastic differential equations

$$ dx = p\,dt \tag{9.3.101} $$
$$ dp = -[\gamma p + U'(x)]\,dt + \sqrt{2\gamma}\,dW(t), \tag{9.3.102} $$

we can estimate the mean first passage time to the plane $S_0$, i.e., to the line

$$ p = -\lambda x. \tag{9.3.103} $$

The results have to be computed for a given set of potentials. In order to assess the
effect of the sharpness of peaking, we consider different temperatures T, i.e., we
consider

$$ dx = p\,dt \tag{9.3.104} $$
$$ dp = -[\gamma p + U'(x)]\,dt + \sqrt{2\gamma T}\,dW(t). \tag{9.3.105} $$

By the substitutions

$$ p \to pT^{1/2}, \qquad x \to xT^{1/2} \tag{9.3.106} $$

we obtain

$$ dx = p\,dt $$
$$ dp = -[\gamma p + V'(x, T)]\,dt + \sqrt{2\gamma}\,dW(t) \tag{9.3.107} $$

where

$$ V(x, T) = U(xT^{1/2})/T. \tag{9.3.108} $$

The simulations were performed with

$$ U(x) = \tfrac14(x^2 - 1)^2 \tag{9.3.109} $$


and the results are shown in Fig. 9.4. They separate naturally into two sets: curved
or straight lines. The best answer is the corrected Smoluchowski, which agrees with
the simulations at all temperatures and, at low temperatures, agrees with our
method. Thus, we confirm the validity of the method in the region of validity
expected, since low temperature corresponds to sharply peaked distributions.
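
A minimal Euler-Maruyama sketch of a simulation of this kind is given below; it integrates (9.3.104, 105) until the line p = -λx is first reached. This is not the code used for Fig. 9.4: the integrator, the step size, the starting point (bottom of the left well with zero momentum), the quartic potential $U(x) = (x^2 - 1)^2/4$ and all names are assumptions made for illustration.

```python
import numpy as np

def dU(x):
    """U'(x) for the assumed double well U(x) = (x**2 - 1)**2 / 4, for which U2 = -U''(0) = 1."""
    return x**3 - x

def mean_first_passage_time(gamma, T, n_traj=200, dt=0.01, t_max=500.0, seed=0):
    """Estimate the mean time to reach the line p = -lambda*x from (x, p) = (-1, 0).

    Returns (mean time over trajectories that crossed, number that had not crossed by t_max).
    """
    rng = np.random.default_rng(seed)
    lam = 0.5 * gamma + np.sqrt(0.25 * gamma**2 + 1.0)   # positive root of (9.3.82), U2 = 1
    x = np.full(n_traj, -1.0)
    p = np.zeros(n_traj)
    t_exit = np.full(n_traj, np.nan)
    alive = np.ones(n_traj, dtype=bool)
    noise = np.sqrt(2.0 * gamma * T * dt)                # standard deviation of the noise increment
    for step in range(1, int(t_max / dt) + 1):
        idx = np.flatnonzero(alive)
        if idx.size == 0:
            break
        x_i, p_i = x[idx], p[idx]
        x[idx] = x_i + p_i * dt                                        # (9.3.104)
        p[idx] = (p_i - (gamma * p_i + dU(x_i)) * dt
                  + noise * rng.standard_normal(idx.size))             # (9.3.105)
        crossed = lam * x[idx] + p[idx] >= 0.0    # w = lambda*x + p starts negative
        t_exit[idx[crossed]] = step * dt
        alive[idx[crossed]] = False
    return float(np.nanmean(t_exit)), int(alive.sum())

if __name__ == "__main__":
    tau, unfinished = mean_first_passage_time(gamma=1.0, T=0.5)
    print("estimated mean first passage time:", tau, "unfinished trajectories:", unfinished)
```

Only trajectories still inside the well are advanced at each step, so the loop stops as soon as every trajectory has crossed. A first-order scheme of this kind overshoots the line slightly, so the step size should be reduced until the estimate no longer changes.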


Fig. 9.4. Comparison of various estimates of the mean exit time from the double well
potential of Sect. 9.3.4. Vertical axis: mean first passage time. Curves: one dimensional,
Kramers, our theory, corrected Smoluchowski; points: mean of 300 computer trials



Notice also that the choice of the plane $S_0$ as the separatrix is appropriate on
another ground. For, near to x = 0, p = 0, we can write

$$ dx = p\,dt \tag{9.3.110} $$
$$ dp = (-\gamma p + U_2 x)\,dt + \sqrt{2\gamma T}\,dW(t). \tag{9.3.111} $$

The condition that the deterministic part of (dx, dp), namely, $(p, -\gamma p + U_2 x)$, is in
the direction connecting the point (x, p) to the origin is

$$ \frac{p}{-\gamma p + U_2 x} = \frac{x}{p}. \tag{9.3.112} $$

Putting $p = -\lambda x$, we find

$$ \lambda^2 - \lambda\gamma - U_2 = 0 \tag{9.3.113} $$

which is the same as (9.3.81) near x = 0. The two solutions correspond to the
deterministic motion pointing towards the origin (+ve root) or pointing away from
the origin (-ve root).

Thus, when the particle is on the separatrix, in the next time interval dt, only
the random term dW(t) will move it off this separatrix and it will move it right or
left with equal probability, i.e., the splitting probability, to left
or right, should be 1:1 on this plane.

This separatrix definition also agrees with that of Sects. 9.1, 2 where
v(x) should be perpendicular to the normal to S.


10. Quantum Mechanical Markov Processes 


Quantum mechanics, since the very early times in the 1920’s, has been recognised 
as a description of the world which contains an essentially statistical aspect. Hence, 
all quantum mechanics must be regarded as being some kind of stochastic process. 
However, what is essentially unique to quantum mechanics is the description in 
terms of complex probability amplitudes, the square of whose modulus gives the 
actual probability of occurrence of an event. 

The formulation of a proper quantum mechanical probability theory, or of 
quantum mechanics in terms of appropriately defined stochastic processes in this 
generalised probability theory, is not the aim of this chapter. What is of interest is 
the introduction of the reader to the rather fascinating world which straddles the 
boundaries of quantum and classical probability theory. This world is the realm of 
quantum optics and quantum electronics, where there are statistical aspects arising 
from the intrinsic quantum nature of the system, as well as fluctuations arising 
from thermal effects. We shall show how the quantum mechanics of optical systems 
can be related closely to Markov jump processes in a suitably generalised form, 
which can themselves very frequently be related, by means of what are known as
P-representations or, otherwise, as phase-space methods, to diffusion processes in the
complex plane. These diffusion processes can describe quasiprobabilities which 
may be negative or complex, or they may define genuine positive probabilities. The 
situation is very similar to that of the Poisson representation of Sect. 7.7 which is 
itself, in fact, a restricted form of P-representation. 

We will formulate this chapter as follows. We first outline the quantum 
mechanics of the harmonic oscillator and introduce the concept of coherent states, 
which are central to the task. We then define a quantum Markov process and 
show how generalised Master equations can be derived for these, in a manner 
similar to that of the adiabatic elimination methods of Chap. 6. From these 
generalised Master equations we can sometimes develop ordinary birth-death 
Master equations, and sometimes, by using P-representations, we can develop 
Fokker-Planck equations. Both methods allow us to apply all the apparatus of 
classical stochastic processes to these quantum mechanical systems. 


10.1 Quantum Mechanics of the Harmonic Oscillator 


We describe the harmonic oscillator in terms of creation and destruction operators 
$a^\dagger$ and $a$, which satisfy the commutation relation 

$$[a, a^\dagger] = a a^\dagger - a^\dagger a = 1 \tag{10.1.1}$$
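
The commutation relation (10.1.1) can be checked numerically in a truncated number-state basis. This is a sketch only, and it assumes the standard matrix elements a|n> = sqrt(n)|n-1>, which are not stated in the passage above; the function names and the cut-off DIM are hypothetical. Because the basis is truncated, the commutator equals the identity everywhere except in the last diagonal entry, which is an artefact of the cut-off.

```python
import math


def destruction(dim):
    """Matrix of the destruction operator a in the first `dim` number states."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)  # assumed rule: a|n> = sqrt(n)|n-1>
    return a


def transpose(m):
    """Transpose of a real matrix (equal to its adjoint)."""
    return [list(row) for row in zip(*m)]


def matmul(x, y):
    """Plain matrix product of two lists-of-lists."""
    cols = list(zip(*y))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in x]


def commutator(x, y):
    """Return [x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(xy, yx)]


DIM = 6  # hypothetical cut-off; the true operators act on an infinite basis
a = destruction(DIM)
a_dag = transpose(a)
comm = commutator(a, a_dag)
diagonal = [comm[n][n] for n in range(DIM)]
print(diagonal)
```

The first DIM - 1 diagonal entries are 1 to rounding error, in agreement with (10.1.1); the final entry is -(DIM - 1), which moves further out as the cut-off is raised.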